mobileOK ref implementation F2F, day 1 -- 12 Jun 2007

Goals for the F2F

Jo: we made quite good progress

... there is already quite a lot of code

... I think we ought to think of several things

... code quality (esp. mine!)

Jo: acceptance criteria of the BPWG as a whole to endorse our work
... probably a combination of test suites that shows that it has reached a reasonable level of quality
... which means we need to think about our testing period
... We probably should also do an audit of how much work there is left
... esp as summer is coming up, and people (incl me) will be taking time off
... Overall, really happy about the progress of the group

abel: a few points that will need to be discussed based on our work
... caching behavior of the checker
... how to deal with CSS? We haven't picked a CSS library yet
... we'll also need to define an exception hierarchy
... also need to work on how to handle messages that come from third party libraries

Jo: indeed, third party messages will be tricky to handle
... e.g. for I18N
... We'll certainly have to revisit this subject
... TagSoup doesn't look too hard to modify

Abel: not all libraries provide error codes

Roland: I hope that we'll get to better define the results document
... would like to make sure it's easy to check a page against the checker: providing examples, ways of fixing code, etc.

Jo: is this in scope for our work?
... MobiReady checker has this, from .mobi
... CTIC could do the same
... I don't know that we need to develop this in this group
... it's probably good that each implementation can add what's needed on top of the common answer all the tools should give
... I don't think it's in scope for this group

Roland: we have to define unique ids to allow other tools to build on top of ours, then

Jo: indeed; it is similar to Abel's concern re error codes
... The question is how much information do we need to provide to make this possible

Nacho: I think we need to map error codes and messages to an unambiguous code provided by moki
... hopefully with a declarative approach

Roland: I think a minimal set of error codes and examples would be useful
... we're not interested in providing a mobileOK checker, but we want to make sure our system produces mobileOK pages
... that's why we're interested to make sure there is a single mobileOK answer

Ruadhan: I would like to get a clear plan what we still need to work on
... also, the current XSLT framework makes it hard to report several failures

dom: error messages are going to be tricky, and we need to use the time to solve that here rather than on the phone
... should be a priority for the meeting
... also setting up the test suite; it should be done sooner rather than later
... I would like to help with the XSLT too

Jo: another concern I have: we need to pay attention to licensing, copyright notices, etc
... we need to make sure for instance that Jhove's license is compatible with our usage
... we may need to engage some legal help at some point
... [summarizing the additional points suggested for the agenda]
... * error reporting, error codes and I18N
... * test suites, and acceptance criteria, beta period
... * licensing and IPR
... * issues with XSLT Framework, standardization of output format
... * CSS Library
... * Exceptions hierarchy
... * Caching behavior

Roland: another question: mobileOK refers often to the HTTP Request headers in 2.3.2, but the list doesn't seem to be complete

Jo: It is actually complete
... this was under discussion on the public mailing list recently, btw
... interesting points that were not made before

-> http://lists.w3.org/Archives/Public/public-bpwg-comments/2007AprJun/0033.html Laurens Host on Accept header in mobileOK

<roland> http://www.w3.org/2005/MWI/BPWG/Group/Drafts/mobileOK-Basic-1.0-Tests/070520#test_objects_or_script refers to 2.3.2 HTTP Request headers

Dom: re HTTP Accept, we should look at what most current mobile browsers do
... if most do as Laurens suggests, we should probably take that approach
... but I don't believe mobileOK basic should mandate one way or the other in any case

Jo: right; I think HTTP certainly doesn't forbid sending the whole thing for each request

Roland: [discussing the reference to http request headers in objects_or_script in mobileOK Basic]

Jo: the tests are indeed a bit inconsistent in terms of the reference to the accepted types
... that said, I'm trying to keep the changes as small as possible

-> http://www.w3.org/TR/xhtml-modularization/dtd_module_defs.html#a_module_Object DTD of object module, showing that "type" attribute is not required on <object />

Dom: note that the type attribute is not mandatory, re test on objects_or_script

<scribe> ACTION: Jo to get back to BPWG on objects_or_script with regard to type attribute on object element [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action01]

<trackbot> Created ACTION-504 - Get back to BPWG on objects_or_script with regard to type attribute on object element [on Jo Rabin - due 2007-06-19].

licensing and IPR

Jo: the first thing to check is whether we can redistribute according to our license
... to do so, we would need to call on legal advice
... probably should be done by W3C legal team

Dom: seems fair, as long as we provide them with a definitive list of packages and licenses

<scribe> ACTION: Dom to give a heads up to legal team re IPR, check if there is anything we should carefully avoid [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action02]

<trackbot> Created ACTION-505 - Give a heads up to legal team re IPR, check if there is anything we should carefully avoid [on Dominique Hazael-Massieux - due 2007-06-19].

Jo: Another question we need to ask ourselves is: what do we want? :)
... obviously we want all the usual open source stuff: free to use, redistribute, with preservation of credits and licensing
... do we want to allow people to modify it?

Dom: if we want to call it open source, we have to

Jo: my only concern with it is that it would allow people to change the code and provide modified mobileOK checker services

dom: we can protect this through the trademark on W3C mobileOK basic
... not sure we can enforce the usage of the ref implementation; most likely, should we enforce anything, it would be based on a conformance test suite
... much like Java
... still, we should allow people to re-use our code (e.g. for other libraries), propose bug fixes and so on

Jo: test suite sounds good
... we want one for our own needs, anyway
... we would need some versioning with it, obviously
... I guess the BPWG will need to look at the question of licensing of mobileOK

Dom: for the checker, I think the working group will need to endorse a test suite as a reference test suite

Jo: we probably need legal advice on how to make sure the words "mobileOK checker" are protected

Dom: it's already protected through our trademark, although we'll certainly need legal advice on how to put it in words
... I think the W3C Software License would cover most of our needs
... W3C has also a process to allow for external contributors
... I guess contributions could be sent to our mailing list (public-mobileok-checker)
... provided we keep maintaining the code - but W3C, CTIC and dotMobi have their on-line services based on the library, there is no reason to worry for the foreseeable future
... the key question would be the maintenance process of the conformance test suite for checkers
... would probably need some input from the BPWG as a whole

Jo: what about credits?

Dom: two different things: credits in the code, vs credits in distributed binary versions
... Requiring credits for binaries that use our code would make our license much less OSS-friendly

Jo: we should check each with our own organizations

<scribe> ACTION: Jo to check with dotMobi that W3C Software license is legalOK [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action03]

<trackbot> Created ACTION-506 - Check with dotMobi that W3C Software license is legalOK [on Jo Rabin - due 2007-06-19].

<scribe> ACTION: Ignacio to check with CTIC that W3C Software license is legalOK [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action04]

<trackbot> Created ACTION-507 - Check with CTIC that W3C Software license is legalOK [on Ignacio Marín - due 2007-06-19].

<scribe> ACTION: Roland to check with 7Val that W3C Software license is legalOK [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action05]

<trackbot> Created ACTION-508 - Check with 7Val that W3C Software license is legalOK [on Roland Guelle - due 2007-06-19].

<scribe> ACTION: Dom to check how to make the names of the companies appear in the copyright statement [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action06]

<trackbot> Created ACTION-509 - Check how to make the names of the companies appear in the copyright statement [on Dominique Hazael-Massieux - due 2007-06-19].

issues with XSLT Framework, standardization of output format

<roland> ScribeNick: roland

ruadhan: if there is more than one error in a test, it is difficult to handle these errors
... when provide defaults fails, you can only report the first

<dom> miguel: the current code only takes care of the first result element

<dom> ... we probably need to make it possible to report more than one failure

<dom> ... I guess we should calculate from the java code what is the complete outcome of the results

<dom> ... (i.e. if there are any failures reported, the outcome is fail; if not, it is pass)

roland: the problem is the java implementation, there is no problem with the xslt
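The rule miguel describes (any reported failure makes the overall outcome a fail, otherwise it is a pass) could look roughly like this on the Java side. This is only a sketch: the names OutcomeSketch, Outcome and overall are hypothetical and not taken from the checker code, and the failures are represented as plain strings for brevity.

```java
import java.util.List;

/** Hypothetical sketch: derive the overall outcome of one test from all failures it reported. */
class OutcomeSketch {
    enum Outcome { PASS, FAIL }

    /** FAIL as soon as at least one failure was reported; PASS when there are none. */
    static Outcome overall(List<String> failures) {
        return failures.isEmpty() ? Outcome.PASS : Outcome.FAIL;
    }
}
```

Keeping every failure in the list, rather than only the first result element, is what would let the results document report more than one failure per test.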

<dom> ACTION: Ignacio to annoy Miguel until he implements the change to java implementation to deal with reporting more than one failure [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action08]

<trackbot> Created ACTION-510 - Annoy Miguel until he implements the change to java implementation to deal with reporting more than one failure [on Ignacio Marín - due 2007-06-19].

dom: so we can move to the output format

abel: the language parameter is there to change the language of the messages

dom: should we add a settings file for the XSLT?

[jo adds this to the agenda]

<jo> scribenick: jo

Result format

roland: standardization of the output format
... based on format from Sean's initial test - limited and needs changing
... need to change the format according to this example

[roland shows an example]

roland: the test reference is the BP ID
... need a sub-reference to mobileOK test
... within the test each possible FAIL or warn needs an ID

error reporting, error codes and I18N

<dom> (XSLT output format and error reporting are obviously strongly bound)

<dom> [discussing about what the results document should contain, e.g. a copy of moki or not]

Results Format Continued

Sean: minimal document we discussed before lunch was ..

Roland: why don't we copy more info from the moki document to make it independent

ruadhan: like code snippets?

dom: well the way to make a choice could be to think about the use cases - I like this approach, which is simple and keeps the localization in one place
... but the basic question is what we want the results document to be ...
... e.g. to integrate into an authoring tool this is not enough.
... e.g. line col needed to highlight the error

sean: use cases, Authoring Tools, IDE;
... Online Tool like validator.w3.org/mobile
... some command line utility
... important use case is tool

jo: what about mobileOK accreditation

sean: the last is the least demanding

dom: the tool case only demands the line and column

jo: it doesn't need to be friendly to users just usable by developers so I suggest
... we do all or nothing - the minimum being that we record the results for the tests and some minimal info about where you have failed if you have failed
... or you include the whole moki output and tell developers that if that's not what they want they can alter the code

sean: agree with the result doc lining up with the mobileOK Basic doc

dom: what about line col - this is not in the moki doc?

jo: ah, I see your point, but how will we get that anyway

dom: we haven't figured that out yet (sic)

<Zakim> dom, you wanted to ask about how to provide location information with XSLT

dom: the question is how do we actually get that information from xslt, which has no notion of it - that said (TM), I do think we should have the location information; the difficulty is that xslt doesn't tell us where the error is
... for example, if there is a basefont etc. xslt won't tell you where
... not possible in xslt 1, and I don't think you can do it in xslt 2 either

roland: you could step through the whole document node by node

dom: that would only work if you ...

roland: we are not interested in text nodes so whitespace doesn't matter

jo: I missed the point that you were making roland

roland: select name from each node and create an absolute xpath

dom: need some magic to turn xpath into a line col

sean: need some factory that allows you to tell for any Dom element where it came from

jo: should we go away and see if someone has solved this problem before?

roland: when we get a fail, generate an xpath to the element that failed
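Roland's idea of taking the name of each node and building an absolute xpath can be sketched as below, using the standard org.w3c.dom API. The class and method names (XPathSketch, absoluteXPath, parse) are hypothetical. The sketch assumes a parse that is not namespace-aware and counts only element siblings, so whitespace text nodes do not affect the result.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical sketch: build a positional absolute XPath for a DOM element. */
class XPathSketch {
    /** Returns a path such as /html[1]/body[1]/p[2] by walking up to the document node. */
    static String absoluteXPath(Element element) {
        StringBuilder path = new StringBuilder();
        for (Node n = element; n != null && n.getNodeType() == Node.ELEMENT_NODE; n = n.getParentNode()) {
            int index = 1;
            // Count preceding element siblings with the same name; text nodes are ignored.
            for (Node s = n.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
                if (s.getNodeType() == Node.ELEMENT_NODE && s.getNodeName().equals(n.getNodeName())) {
                    index++;
                }
            }
            path.insert(0, "/" + n.getNodeName() + "[" + index + "]");
        }
        return path.toString();
    }

    /** Convenience parser for trying the sketch out (not namespace-aware). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because every step carries a position index, the path stays unambiguous even when the same markup appears several times in the document. It still does not give a line and column.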

dom: but what we want to give is line col not xpath

roland: so we match the code in the document and find the code in the source document
... in the post processing ...

dom: but that doesn't work if there are duplicates in the document

jo: let's annotate each line with a comment then the line number is the preceding sibling

everyone: oh god that is terribly hacky

jo: so?

[dom then goes on to explain why it doesn't work as well as being very tacky]

[discussion of how people use SAX to get the information]

jo: I remember that the MS parser does this

ruadhan: yes something in .net and python too

sean: well it can be done ...

... we do need to ask someone to research this

dom: and it needs to be found out sooner rather than later

... we could "remain silent" on the line and column no and just give the xpath and leave the problem to implementors

sean: there must be some way to do it ...

dom: we should be open to the idea of replacing the xslt with sax parsing if that is what it takes

... but before that we should action someone to find out

... if we can get this from xpath

ACTION: Sean research annotations in the Dom giving the source line and column number [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action09]

<trackbot> Created ACTION-511 - Research annotations in the Dom giving the source line and column number [on Sean Owen - due 2007-06-19].

ACTION: Roland to research getting line and column information from XPath [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action10]

<trackbot> Created ACTION-512 - Research getting line and column information from XPath [on Roland Guelle - due 2007-06-19].

Roland: want to build a regex to match foo/@bar

dom: doesn't work ...

roland: ok

dom: one option going from xpath to line/col is sax - you do end up parsing again
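The option dom mentions (parsing again with SAX to go from an xpath to a line and column) might be sketched as follows. The handler records the position reported by the standard org.xml.sax.Locator for every element, keyed by a positional path of the form /html[1]/body[1]/p[2]. The class name LocationIndexSketch and the path format are assumptions made for illustration. The sketch also assumes the parser supplies a Locator, which SAX encourages but does not require. Note that in startElement the Locator reports the position just after the start tag, not where the tag begins.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: map positional XPaths to {line, column} with a second SAX pass. */
class LocationIndexSketch extends DefaultHandler {
    private Locator locator;
    private final Deque<String> paths = new ArrayDeque<>();                 // path of each open element
    private final Deque<Map<String, Integer>> counts = new ArrayDeque<>();  // child-name counters per open element
    final Map<String, int[]> positions = new LinkedHashMap<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override
    public void startDocument() {
        paths.push("");
        counts.push(new HashMap<>());
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Position of this element among same-named siblings of its parent.
        int index = counts.peek().merge(qName, 1, Integer::sum);
        String path = paths.peek() + "/" + qName + "[" + index + "]";
        positions.put(path, new int[] { locator.getLineNumber(), locator.getColumnNumber() });
        paths.push(path);
        counts.push(new HashMap<>());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        paths.pop();
        counts.pop();
    }

    /** Parses the XML text and returns the path-to-position index. */
    static Map<String, int[]> index(String xml) {
        try {
            LocationIndexSketch handler = new LocationIndexSketch();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.positions;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A failure reported with an xpath could then be looked up in the index to obtain a line and column, at the cost of the second parse.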

sean: so assuming we can get the line/col what should we put in the result document?

dom: [explains facing the board hiding what he is writing]

sean: there are 4 ways to elaborate the position:

<dom> subelement or attributes: row/column, headername

... no position (refers to the whole), header, line col and url in addition

jo: don't we need line col in headers too?

dom: well I'd prefer not

sean: if we showed the header value then that would be enough

... would not refer to images etc.

dom: could add a rank to clarify in the case of a duplicate HTTP header

... need to provide the bare minimum to provide useful stuff

... i.e. not code snippets

sean: so how about we miss out the code snippet?

nacho: not needed

abel: if it is an image then we need byte offset

jo: yes, and for character encoding errors

<dom> Pointer methods in RDF

[sean explores various possibilities]

[dom points out that it's easier to work with an explicit byte offset]

nacho: it would be better to have an xpath location to tie back to the moki document

... we do the xpath first

... then if we find a way of doing the line and column do it later

dom: Ok, but I think it needs line/col though it might speed up development

sean: agree with Dom - need line col to original document

jo: points out that the original document and the current resource at the URI in question are not necessarily the same

... so think that if you want to reference you need to reference the moki document

Result Document - In summary

[conversation went off-road at this point]

[discussions on how to get the position of the errors from XSLT]

<ruadhan> line & column from xpath?

<ruadhan> http://www3.telus.net/minevskiy/ivan/project_XPath.htm

[discussions on whether browsers download several times a non-cacheable image loaded several times in a single page]

<dom> random image with no-cache

<roland> http://www.xml.com/pub/a/2004/11/24/py-xml.html, File Locations from SAX

<dom> demonstration of using three times an uncached resource

<dom> <dom> Yves, if an image with a no-cache directive is loaded three times in the same page, should a browser make three separate http requests, or is it "normal" not to? is there anywhere where this "in-memory" caching would be specified?

<dom> <ChrisL> dom - yes it should

<dom> <dom> my observation is at least that firefox doesn't

<dom> <Yves> there is no "in memory" caching. HTTP specifies no-cache and no-store

<dom> <dom> do you have an answer to the first part of my question, Yves? :)

<dom> <Yves> not really as it mixes two things, HTTP interactions (in that case, yes it should do multiple requests)

<dom> <Yves> and HTML parsing, where the client can have one reference to external objects and say that all the references to the same object leads to only one in memory

<dom> <Yves> in that case it is valid to generate only one HTTP request

<dom> <Yves> so the right answer for your question is "undecided"

<dom> <Yves> unless there is a processing model defined for HTML :)

<dom> (new tests in http://www.w3.org/2007/06/test-html-no-cache.html , include hex-encoded URIs which at least Firefox makes a separate request for)

<dom> (and my Blazer User Agent doesn't normalize port :80 mentions, while Firefox does)

[massive digression on tidying the character encoding]

<dom> "You can also supply an AutoDetector that peeks at the incoming byte stream and guesses a character encoding for it. Otherwise, the platform default is used. If you need an autodetector of character sets, consider trying to adapt the Mozilla one; if you succeed, let me know." http://ccil.org/~cowan/XML/tagsoup/

<dom> IconV package in Java

[resuming discussion of what needs to go in the minimal document]

We resolved ref tidying as follows:

(All designed in a way that allows others to contribute trial decodings)

1. Get the document

2. Determine stated character encoding - look at, in order, Content-Type, XML declaration and meta

2a if it claims to be UTF-8 tidy if necessary by inserting replacement char

2b If it claims to be something other than UTF-8, try converting it using an appropriate tool

2c If it makes no claim - treat it as a utf-8 byte stream and add replacement chars where necessary

3. HTML Validate with xxx

3a. if invalid tidy

The tidied document is hence the result of either character encoding tidying or HTML tidying, and the tests need to point out where they are working on tidied versions.
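Steps 2 to 2c above could be sketched in Java with the standard java.nio.charset API, whose decoders can substitute U+FFFD for malformed input. The names EncodingTidySketch, statedEncoding and tidy are hypothetical. Using Charset.forName for step 2b is an assumption standing in for "an appropriate tool", and extracting the three encoding claims from the response and the document is left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the character encoding tidying steps 2, 2a, 2b and 2c. */
class EncodingTidySketch {
    /** Step 2: the first claim present wins, in the order Content-Type, XML declaration, meta. */
    static String statedEncoding(String fromContentType, String fromXmlDeclaration, String fromMeta) {
        if (fromContentType != null) {
            return fromContentType;
        }
        if (fromXmlDeclaration != null) {
            return fromXmlDeclaration;
        }
        return fromMeta; // null when the document makes no claim at all
    }

    /** Steps 2a to 2c: decode the bytes, replacing anything undecodable with U+FFFD. */
    static String tidy(byte[] body, String stated) {
        // 2c: no claim means "treat it as a UTF-8 byte stream".
        // 2a and 2b: otherwise decode with the claimed charset (throws if the JVM does not know it).
        Charset charset = (stated == null) ? StandardCharsets.UTF_8 : Charset.forName(stated);
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE)
                    .decode(ByteBuffer.wrap(body))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE, but the method signature requires handling it.
            throw new IllegalStateException(e);
        }
    }
}
```

Comparing the decoded text with a strict decode of the same bytes would be one way to tell whether tidying actually changed anything, so that tests can flag that they ran on a tidied version.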

Dom said he'd give implementing it a shot.

ACTION: Dom to give implementing character encoding thingy a shot [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action11]

<trackbot> Created ACTION-513 - Give implementing character encoding thingy a shot [on Dominique Hazael-Massieux - due 2007-06-19].

We resolved that the result document format was as follows:

one position per result

if necessary repeat result for each error

<info> message is parameterised
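As a rough illustration of the three points above, a result could be modelled in Java as below: one position per result, a result repeated for each error, and a message rendered from a pattern plus parameters so that localization stays in one place. Every name here (ResultSketch, testId, messageId, forEachError, render) is hypothetical, and the actual element and attribute names of the results document are not shown. java.text.MessageFormat is used only as one possible way to parameterise messages.

```java
import java.text.MessageFormat;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of one entry in the results document. */
class ResultSketch {
    final String testId;     // hypothetical identifier of the mobileOK test
    final String messageId;  // hypothetical unique ID of the FAIL or warn within that test
    final String position;   // exactly one position per result
    final String[] params;   // values substituted into the localized message

    ResultSketch(String testId, String messageId, String position, String... params) {
        this.testId = testId;
        this.messageId = messageId;
        this.position = position;
        this.params = params;
    }

    /** Repeats the same result once per error position. */
    static List<ResultSketch> forEachError(String testId, String messageId,
                                           List<String> positions, String... params) {
        List<ResultSketch> results = new ArrayList<>();
        for (String position : positions) {
            results.add(new ResultSketch(testId, messageId, position, params));
        }
        return results;
    }

    /** Renders the message from a pattern looked up elsewhere (for example per language). */
    String render(String pattern) {
        return MessageFormat.format(pattern, (Object[]) params);
    }
}
```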

<roland> example of what a results document looks like

ACTION: Sean to implement the above results format and codify in EARL if possible [recorded in http://www.w3.org/2007/06/12-bpwg-minutes.html#action12]

<trackbot> Created ACTION-514 - Implement the above results format and codify in EARL if possible [on Sean Owen - due 2007-06-19].
