mobileOK ref implementation F2F, day 2

13 Jun 2007


See also: IRC log


Present: Abel, Miguel, Ignacio, Sean, Dom, Roland, Ruadhan
Scribe: dom, ruadhan




<dom> Minutes of Day 1 (June 12)

<dom> Third Party study from CTIC guys

<dom> (the latest version of that document on Google Docs is dated June 12)

<dom> ScribeNick: dom

Using errors reported from other tools

Sean: I think we need to take an ad-hoc approach, do our best with what the tools provide, and if they don't, find our ways around

Abel: one of the problems we identified is that most of these tools don't provide error codes
... for instance, the XHTML module in JHove only outputs a message and a location
... the image module provides a message and a bytes offset

Sean: I think it's fine for us to parse the error messages; it's ugly, but probably shortest way forward
... would be better if they provided a better API

Miguel: the difficulty will be to identify all the possible messages
... in Jhove, they are hardcoded in the source itself, not even in property files
... and of course, the messages are parametrized (e.g. to include the name of the element that triggered the validity error)
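
A minimal sketch of the message-parsing approach discussed here, assuming a third-party message follows a fixed template with one variable part. The `MessagePattern` class, the error code and the message wording used with it are hypothetical and are not taken from JHove's source.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps a free-text third-party message to a stable code.
class MessagePattern {
    private final String code;
    private final Pattern pattern;

    // The regex is expected to contain exactly one capturing group
    // for the variable part of the message (e.g. an element name).
    MessagePattern(String code, String regex) {
        this.code = code;
        this.pattern = Pattern.compile(regex);
    }

    // Returns "CODE:parameter" when the whole message matches, empty otherwise.
    Optional<String> match(String message) {
        Matcher m = pattern.matcher(message);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(code + ":" + m.group(1));
    }
}
```

Each known message would need its own pattern, which is exactly the maintenance cost Miguel points out.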

Jo: if we have to review all the error-triggering code, we may as well fix it!

Sean: one way to get around that is simply to include the messages that Jhove sends to us in the error messages we send back to the user

Dom: but that's a killer in terms of I18N

Sean: right, that's the big downside
... but I would still favor just doing it

jo: I guess the question is whether perfect error reporting is part of our requirements or not
... we need to strike a balance between what we would like to achieve, what we need to achieve and what we can achieve in a reasonable amount of time

Sean: I think we should focus on what we have to do first

Jo: whatever we decide, we just need to make clear what restrictions our first version will have
... too bad libraries don't handle this well

Dom: I note *our* library will have exactly this same problem given the decision we took yesterday (non-parametrized error messages)

Sean: I think we should proceed with the simple solution for now, and fix it later

[discussions on whether we should favor a pragmatic vs esthetical approach]

Dom: what are you guys doing in your tools?
... The checker just sends back the messages the XML validation library produces

Ruadhan: same for us

Miguel: in TAW, we had to hack around the XML validation to get translated messages
... The CSS library allowed for localization, so we didn't have the same problem

Sean: let's have a better system as our goal for version 1.0, but move forward with the simple version now

Miguel: if so, we should probably separate the data in the results
... so that it's clear that some part of the messages aren't produced by our library

Sean: sounds good, indeed
... so we amend the results document as we discussed yesterday to create an additional element (e.g. "details") to include third party library messages
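
A rough sketch of what such a result entry could look like when built with the standard DOM API. Only the "details" element name comes from the discussion; the "result" and "message" names and the `ResultBuilder` class are assumptions made for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical builder for one entry of the results document.
class ResultBuilder {
    static Element buildResult(String ownMessage, String thirdPartyMessage) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No XML parser available", e);
        }
        Element result = doc.createElement("result");
        doc.appendChild(result);

        // Message produced by our own library (can be localized).
        Element message = doc.createElement("message");
        message.setTextContent(ownMessage);
        result.appendChild(message);

        // Untranslated text passed through from the third-party library.
        Element details = doc.createElement("details");
        details.setTextContent(thirdPartyMessage);
        result.appendChild(details);
        return result;
    }
}
```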

<Zakim> dom, you wanted to propose that we have somewhere a reference results document so that we can know at any time the expected structure of the results doc

Sean: point taken; I guess the reference would be what is in the test suite

Jo: random thought of the day: should the individual tests reported by the XSLTs have a specific version number attached?

dom: don't think that's necessary
... let's wait until we would actually need it

Jo: also, we need to look at how to report errors from the library
... e.g. out of memory errors

Sean: don't think that needs to be part of the results document
... just throw an exception

Jo: I think the results document should give some indication of this
... e.g. with a CannotTell

Dom: I think both approaches are reasonable
... the only question is whether exceptions get handled in or out of the library
... I think the difference is whether you consider the API to be the Java API or the XML API
... don't think we've ever made a clear decision on this

Jo: we should probably move on for now, but we'll need to get back to this

Ruadhan: if the results document is a report, it should always report whether a test passed or failed, it can't be silent on it

dom: if we can solve this with just another wrapper to catch exceptions, it's probably worth keeping the exceptions, as this gives us the best of the two worlds

sean: doesn't seem very clean

dom: I say, let's keep the Java API clean, and how exceptions are handled can be decided later on, or even by a wrapper library should the need arise

sean: still not convinced, but we should move on
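
One way the wrapper idea could look, as a sketch only since the group left the question open. The `Outcome` enum and `SafeRunner` class are hypothetical; the inner API keeps throwing exceptions, and the wrapper turns them into a CannotTell-style outcome.

```java
import java.util.function.Supplier;

// Hypothetical outcome values; CANNOT_TELL mirrors the CannotTell idea above.
enum Outcome { PASS, FAIL, CANNOT_TELL }

class SafeRunner {
    // Runs a test and converts any runtime exception into CANNOT_TELL,
    // so a report built on top of this is never silent about a test.
    static Outcome run(Supplier<Outcome> test) {
        try {
            return test.get();
        } catch (RuntimeException e) {
            return Outcome.CANNOT_TELL;
        }
    }
}
```

Callers that prefer exceptions would use the inner API directly; callers that want a complete report would go through the wrapper.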

test suites, and acceptance criteria, beta period

Sean: we already have a set of unit tests, which hopefully we can use to convince the BPWG that we do indeed implement mobileOK Basic

Jo: one of the questions is what part of the results document constitutes a proof that your checker is indeed a mobileOK checker

Sean: clearly the error messages shouldn't be required
... I guess it should be that you do report the right errors

[discussions on protecting mobileOK checker through test suite]

Jo: still, we need to make sure our test suite is complete

Sean: I say we add tests as needed

Jo: we want to keep in mind that the test suite will need to be versioned

Exceptions hierarchy

Abel: currently we only have one type of Exception
... we could have two kinds of Exceptions
... to distinguish Fatal Errors (e.g. config file not found) vs exceptions raised in the test execution (e.g. exception raised by Jhove)
... (this relates to our earlier discussion on error reporting)

Sean: so the question is whether we want to subclass TestException
... my take is someone using our code wouldn't care about the type of the exception
... I guess we could chain exceptions if we do want a hierarchy
... I don't oppose having a hierarchy if there is a use case for that
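
A sketch of the two-kind hierarchy Abel describes, combined with the exception chaining Sean mentions. `TestException` here is a stand-in for the project's existing class, whose real definition may differ; the two subclass names are hypothetical.

```java
// Stand-in for the project's existing TestException; the real class may differ.
class TestException extends Exception {
    TestException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical: the checker cannot run at all (e.g. config file not found).
class FatalTestException extends TestException {
    FatalTestException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical: one test could not run because a third-party library failed.
// The original exception is chained as the cause rather than discarded.
class ThirdPartyTestException extends TestException {
    ThirdPartyTestException(String message, Throwable cause) {
        super(message, cause);
    }
}
```

Code that does not care about the distinction can keep catching `TestException` only.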

Abel: if one test failed because of a failure of a third party, what is the result?

Sean: that's indeed the question we just discussed
... do we report it or not?
... I guess Dom and Jo argued for outputting a minimal results document with a CannotTell message

dom: I think we were actually asking for a document as complete as possible (i.e. including the results that were indeed processed)
... and also some information as to why one or more of the tests couldn't be run

Nacho: not sure we need a cannot tell, since it's not defined in mobileOK
... a warning should probably be enough

Sean: I still think we should get back to that later
... if we keep exceptions, I think the current flat exception space is ok, although I'm open to expand it if use cases suggest it
... if we report cannottell outcomes, I don't think we should raise exceptions at all

CSS Library

Nacho: I think we should decide what library to use

Sean: so, in our choices, one was good at syntax parsing, and the other @@@

Miguel: another point to consider is how to turn the CSS style sheet into XML if we want to process it through XSLT

Jo: I'm not quite sure what we should do here
... I'm tempted to only integrate the error reports from the library in the moki document instead
... the library that turns CSS into XML has too many flaws for our own use
... and it would probably be out of our scope to develop such a library at this point
... (although it would certainly be nice to be able to do so)
... The best option is probably to go with the SAX CSS parser, since it's the most likely to work for our purposes

Sean: so we need to both validate and analyse the style sheets
... is one library enough or do we need two for that?

Miguel: the SAX parser can only be used for analysis of the style sheet
... the only library I know to validate a style sheet against CSS 1 is the W3C CSS Validator

Dom: another option is to use the SOAP interface for the CSS Validator
... (although it prevents using our system as an all-in-one package)

Sean: I'd rather keep it in all-in-one
... so can we use the css validator code for our purposes?

Miguel: yes; the only problem is that it is a bit slow

Nacho: I think it's probably good enough for our first version of the checker

Jo: this raises the point that we should have a wrapper for our validation code
... so that it's easier to swap validators if we choose to
... identifying a common interface around these validators would be a good way to identify what we want out of these validators anyway
... our current code is ugly

Sean: I'm personally fine with binding directly to the library
... it also obscures the code less
... and it allows a greater use of the underlying API

Jo: true... probably a matter of taste

Sean: also, we're already using well-defined APIs (SAX, XML validation, etc)
... and if we were to change a validator, I'm not sure an abstract API would actually save us so much time

Jo: I don't disagree with you
... it would certainly be helpful to have a common interface for validators, that said
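
For illustration, a common validator interface of the kind Jo has in mind might be as small as the sketch below. The `ResourceValidator` name and its single method are assumptions, and `NonEmptyValidator` is a toy implementation standing in for a wrapped library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical common interface: each wrapped validator returns plain messages.
interface ResourceValidator {
    List<String> validate(String content);
}

// Toy implementation standing in for a real wrapped library.
class NonEmptyValidator implements ResourceValidator {
    @Override
    public List<String> validate(String content) {
        List<String> errors = new ArrayList<>();
        if (content == null || content.trim().isEmpty()) {
            errors.add("resource is empty");
        }
        return errors;
    }
}
```

Swapping validators would then mean providing another implementation of the interface, at the cost Sean notes of hiding the richer underlying API.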

dom: note that the W3C Unicorn project has more or less defined such an API, if you're interested

Jo: sounds interesting; anyway, it sounds like we're not going to proceed that way for the time being

Sean: here is what I think we should do:
... we should remove the JXCSS thingy I had started
... we use the W3C CSS validator for validation
... and SAC for the actual test implementation

Jo: one of the difficulties is to deal with inline CSS

Sean: do we allow the style attribute?

Dom: we do

Sean: so that will need to be implemented
... fortunately, our tests on CSS are fairly simple (e.g. don't use "px")
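
To show how simple such a check can be, here is a sketch that flags "px" lengths in the value of a style attribute. It deliberately uses a plain regular expression rather than SAC, so it is only a stand-in for the parser-based implementation agreed above; the `PixelUnitCheck` class is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical, regex-based stand-in for a parser-based CSS test.
class PixelUnitCheck {
    // A number (integer or decimal) immediately followed by "px".
    private static final Pattern PX =
            Pattern.compile("\\b\\d*\\.?\\d+px\\b", Pattern.CASE_INSENSITIVE);

    // True when the style value contains at least one length in px.
    static boolean usesPixels(String styleValue) {
        return PX.matcher(styleValue).find();
    }
}
```

A regex like this can misfire on strings such as file names inside url(), which is one reason to prefer a real CSS parser for the actual tests.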

<scribe> ACTION: Ignacio to work with Miguel and Abel to implement the CSS stuff (removing JXCSS, implement validation, and use SAC for test implementation) [recorded in http://www.w3.org/2007/06/13-bpwg-minutes.html#action01]

<trackbot> Created ACTION-515 - Work with Miguel and Abel to implement the CSS stuff (removing JXCSS, implement validation, and use SAC for test implementation) [on Ignacio Marín - due 2007-06-20].

Jo: I still think we should keep as a goal to have at some point a CSS-in-XML implementation in moki

Dom: how hard would it be to use SAC to generate such a thing? should be relatively straightforward, shouldn't it?

<scribe> ACTION: Jo to evaluate how hard it would be to produce XML out of CSS stylesheets using SAC [recorded in http://www.w3.org/2007/06/13-bpwg-minutes.html#action02]

<trackbot> Created ACTION-516 - Evaluate how hard it would be to produce XML out of CSS stylesheets using SAC [on Jo Rabin - due 2007-06-20].

Caching behavior

Dom: the question is what caching should our library do?

Jo: indeed, what do we cache and under what circumstances? esp. given what we discovered yesterday re caching and URIs

dom: two caching questions: keeping a list of URIs already downloaded in the given request vs keeping a resource that was downloaded for a previous analysis so that you don't have to do it again

Jo: I think we shouldn't do the latter, and should do the former
... we also need to discuss what to do with regard to URIs given our discovery of yesterday

Dom: [different cases of what browsers do in terms of canonicalization]

Sean: think we should keep it simple (i.e. simple string comparison), and that should be pretty close to what current browsers do
... unlikely to happen very often anyway

Dom: only thing we have to do for sure is making URIs absolute
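
A minimal sketch of the per-request cache as discussed: references are made absolute against the page URI, then compared as plain strings. The `RequestCache` class and the fetcher callback are hypothetical; a real implementation would fetch over HTTP instead of calling a function.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory cache, meant to live only for one analysis request.
class RequestCache {
    private final Map<String, String> bodies = new HashMap<>();
    private int fetchCount = 0;

    // Resolves the reference against the base URI, then looks it up by
    // simple string comparison; the fetcher is called only on a cache miss.
    String get(URI base, String reference, Function<String, String> fetcher) {
        String absolute = base.resolve(reference).toString();
        return bodies.computeIfAbsent(absolute, uri -> {
            fetchCount++;
            return fetcher.apply(uri);
        });
    }

    // Number of times a resource actually had to be fetched.
    int fetchCount() {
        return fetchCount;
    }
}
```

No further canonicalization is attempted, so two spellings of the same URI that differ after resolution are fetched twice, which matches the "keep it simple" position above.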

Ruadhan: I'm willing to take an action item to implement this

<scribe> ACTION: Jo to annoy Ruadhan until he implements the in-memory caching per URIs [recorded in http://www.w3.org/2007/06/13-bpwg-minutes.html#action04]

<trackbot> Created ACTION-517 - Annoy Ruadhan until he implements the in-memory caching per URIs [on Jo Rabin - due 2007-06-20].

"If a Mobile Web site adapts in the forest and no user agents are there, is it OK?" -- DanA

Documentation: schemas, introduction, ...

<scribe> ScribeNick: ruadhan

Sean: concern is what we are going to provide in the end product

jo: and schemas

Sean: introduction, overview

Jo: talked about yesterday
... don't consider ourselves finished until we have provided a certain amount of documentation; we need to decide what this is

Sean: ... bug tracking?

Dom: let's use W3C Bugzilla
... don't like Bugzilla necessarily, we could use tracker

Roland: need docs in xslt?

Sean: we should have comments in source also
... what else do we need?

Jo: nacho mentioned developer guide

Sean: we should take resolution not to finish until documentation is finished

Jo: Problem with frameworks can be lack of documentation

Sean: that's what the developer guide will be about
... schema can be done later

Jo: worth doing now

Sean: will take user guide, dev. guide,
... we all need to do javadoc and comments
... Jo will take schema and homepage

nacho: i will take dev guide and user guide

Jo: started homepage in adhoc way -
... if anyone wants to contribute, feel free, just need write access

Sean: homepage is real nice
... need a mobile-friendly version

Jo: might be ok

Sean: anything else?

setting up configuration framework (e.g. for language setting, authentication, ...)

miguel: what about authentication parameters, locale...

Sean: authentication mentioned somewhere, how do we support?

Jo: I would like to have some minor configuration options
... e.g. doing our own redirection handling, but might be nice to use standard commons redirection

Sean: so there's a class of development-only options
... my concern is that mobileOK should mean one thing and be one thing
... it's mobileOK or not

Nacho: what about validating local doc that you can upload instead of just passing URL

Jo: some kind of desktop integration would be nice

sean: configure a local directory that acts as a pseudo webserver
... right now we already have something like this in the code
... test docs that have a document specifying test headers and starts tomcat
... might be nice to be able to test localhost

Jo: more will come out but we just need a single approach

Sean: i can name many mechanisms:
... config file, xml or properties, command line / env variables

Nacho: verbosity level

Sean: of log statements in the code?

Nacho: granularity of results document

Sean: the results doc should be the same all the time for consistency
... but maybe we do want a quick mode: just passes and fails

Nacho: could be quicker if the framework is not figuring out lines and cols etc.
... was thinking about results doc to save time in processing

Sean: how should we store this stuff

Roland: set parameters in web interface - quick mode or developer mode
... some config params can be set by user, there are options per request and per the whole thing

Sean: config file appropriate for global options
... per request, maybe a java class

Jo: how do we make globally accessible

Sean: within the code we could use some kind of singleton, get an instance of the configuration object
... or configuration could live in an object within the tester

Jo: but how to access it

Sean: yeah, without passing it all over the place
... it's doable...
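
A sketch of the singleton option, assuming global options come from a properties file and are exposed through one class. The `CheckerConfig` class and the "locale" and "maxRedirects" option names are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical global configuration, reachable without passing it around.
final class CheckerConfig {
    // Starts with built-in defaults until load() is called.
    private static volatile CheckerConfig instance = new CheckerConfig(new Properties());

    private final Properties props;

    private CheckerConfig(Properties props) {
        this.props = props;
    }

    // Replaces the global instance with options read in .properties syntax.
    static void load(Reader source) {
        Properties p = new Properties();
        try {
            p.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        instance = new CheckerConfig(p);
    }

    static CheckerConfig get() {
        return instance;
    }

    String locale() {
        return props.getProperty("locale", "en");
    }

    int maxRedirects() {
        return Integer.parseInt(props.getProperty("maxRedirects", "5"));
    }
}
```

Any class can then call `CheckerConfig.get().locale()`; per-request options would still need a separate object, as Sean suggests.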

Jo: I don't care how it's done, is this something someone can take on?
... and what is our approach to logging?

Sean: I suggest java.util.logging

Jo: many approaches

Sean: preference is for java.util.logging, they all pretty much do the same
... when to log "fine" "info" "warn"
... recap: we've identified enough that we need a mechanism
... some of these features are for development more than anything
... and what about the global config
... I think I can solve the global problem
... last question is how do we get the global options in: config file, command line options?
... I'm happy to do this
... global config in a file, and a class encapsulating the options
... let's talk about logging some more later
... can use our judgement about when to log fine, warn, info etc.
... Any other requirement?
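
As a small illustration of the java.util.logging levels mentioned above (where "warn" corresponds to `Level.WARNING`), the sketch below picks a level per event. The logger name, the `RetrievalLogging` class and the rule for choosing a level are assumptions, not a project decision.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example of choosing between FINE, INFO and WARNING.
class RetrievalLogging {
    private static final Logger LOG = Logger.getLogger("org.example.checker");

    // Picks a level from the HTTP status: errors are warnings, the rest is info.
    static Level levelFor(int status) {
        return status >= 400 ? Level.WARNING : Level.INFO;
    }

    static void logResponse(String uri, int status) {
        // Tracing detail; not shown with the default logging configuration.
        LOG.fine("Finished request for " + uri);
        LOG.log(levelFor(status), "{0} returned status {1}", new Object[] {uri, status});
    }
}
```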

miguel: what about, for example, when a page includes a reference to an image with size 1MB
... do we download it, or do we set a limit?

Sean: yeah what about files that are 10MB, or 100MB

Jo: yes this has been on my mind
... the fact that we build a DOM in the first place is fundamental and at the heart of this issue

Sean: one solution is, in the retrieval, if the doc is > 1MB, to just cut it off and call it a network error
... just to protect against malicious attacks
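
A sketch of the cut-off Sean proposes: read the response body through a counter and give up once a limit is exceeded. The `BoundedReader` class is hypothetical, the limit is a parameter rather than a fixed 1MB, and an `IOException` stands in for "call it a network error".

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper that refuses to buffer more than maxBytes.
class BoundedReader {
    static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
            if (total > maxBytes) {
                // Stop reading instead of downloading the whole resource.
                throw new IOException("Resource exceeds " + maxBytes + " bytes");
            }
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}
```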

Roland: do we limit number of resources?

Nacho: we should have some hacking session, we try to break it

Sean: what do you guys do?

Dom: in checker there is a limit of number of redirects of 5
... checker doesn't follow link to itself
... Can only run it on its homepage

ruadhan: same for ready.mobi

Dom: i limit number of links
... i don't limit the size of resources

Jo: need to both count the redirects and check they are not circular

Sean: let's call it "safety hazards"
... redirects
... #links
... resource size (DOM, images)
... links to self
... stalled requests, timeouts
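
For the redirect hazards in this list, a sketch that both counts redirects and detects loops, as Jo asks. Redirects are simulated with a map from URI to target so that the example stays self-contained; the `RedirectFollower` class is hypothetical and the limit is a parameter (Dom's checker uses 5).

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical redirect follower working on a simulated redirect table.
class RedirectFollower {
    // Returns the final URI, or throws if the chain is too long or circular.
    static String follow(String start, Map<String, String> redirects, int maxRedirects) {
        Set<String> seen = new HashSet<>();
        String current = start;
        seen.add(current);
        int hops = 0;
        while (redirects.containsKey(current)) {
            if (hops == maxRedirects) {
                throw new IllegalStateException("Too many redirects");
            }
            current = redirects.get(current);
            hops++;
            // add() returns false when the URI was already visited.
            if (!seen.add(current)) {
                throw new IllegalStateException("Circular redirect at " + current);
            }
        }
        return current;
    }
}
```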

Jo: what about if someone is using you as a proxy for DOS attacks

miguel: if someone uses us for DoS, it's not efficient enough as it's not a fast process

Sean: let's revisit this at next F2F

audit/estimate to completion

Sean: ok, where are we?
... the goal is to get to something that looks like an alpha in early July
... what do we need to sort out in the next 4 weeks
... let's recap the actions

Jo: css stuff needs to be done urgently

Sean: actions 505 to 508 don't seem critical

Jo: probably need a written document saying "yes it's ok for me to contribute"

Sean: action-510 (multiple results per test) is critical
... action-511 is critical (research annotations to DOM for line and col)
... action-512 - should figure this out soon (line & col from xpath)
... action-513 - critical also (character encoding thingy)
... my intern & I will take this one
... action-514 needs to be done soon (implement results and encode in EARL if poss)

Jo: action-516 needs to happen before 515 (both about css...)
... do we need ownership of portions of code

<dom> [I just plugged tracker so that it will also watch mail sent to public-mobileok-checker, so that we get e.g. action items referenced from the Web interface]

Sean: was hoping that these things would be done as needed
... e.g. if you need something in moki you would add it
... on target

dom: one question is which test do i take to implement?

<abel> http://docs.google.com/Doc?id=dgh5r6zs_5cb7gz3

Roland: I have my name on a number of tests, that's ok!
... i'm not so good in Java

Sean: i'll work on non-test stuff for now

<dom> abel, nacho, could you add me to the list of authorized editors for that doc (dom@w3.org)?

Sean: a lot of work here is writing test-cases...
... we'll continue to use google doc to coordinate this

<dom> [I just got access through Jo, thanks!]

Sean: goal by 1st week of July is something that kind of runs, and produces meaningful output

Dom: I tried to run tester and used option to output separately the results - is it just me?

Sean: I run the unit tests, and right now, at least one fails

Dom: command line runs, but just not doing what I expected

Jo: I thought it was working - it was me that put that command line stuff in, so there's a good chance it's not working!

Sean: that concludes our list of items
... is there anything else we haven't talked about?

<dom> [I just found what I was doing wrong with the command line, sorry for the noise]

Jo: might be worth doing a code review

<nacho> ACTION: Ignacio to create a preliminary version of mOK checker User Guide and Developer Guide documents [recorded in http://www.w3.org/2007/06/13-bpwg-minutes.html#action05]

<trackbot> Created ACTION-519 - Create a preliminary version of mOK checker User Guide and Developer Guide documents [on Ignacio Marín - due 2007-06-20].

code reviews

<dom> Code Source in CVS

<dom> the actual Java classes

<dom> [Jo, doesn't javadoc react to @todo rather than TODO?]

<dom> AbstractXSLTTestImplementation.java

<dom> ThirdPartiesMessageUtils.java

<dom> ValidationMessage

<dom> (based originally on http://dev.w3.org/cvsweb/2007/mobileok-ref/src/org/w3c/mwi/mobileok/basic/XHTMLValidationErrorHandler.java from ruadhan)

<dom> HTTPXHTMLResource

<dom> HTTPResource

<dom> HTTPRedirect

<dom> Apache commons httpclient.URI

<dom> comparison between apache commons URI and java.net URI

<dom> TestResults

<dom> Preprocessor

<dom> ScribeNick: dom

Jo: XSLT test developers should pay attention to the normalization of HTTP headers
... Preprocess::addHeader uses HeaderParseMethod to take care of the normalization
... the categorization of parsing modes made in there is based on the RFC

-> http://dev.w3.org/cvsweb/2007/mobileok-ref/src/org/w3c/mwi/mobileok/basic/xslt/ XSLT used in mobileok ref

-> http://dev.w3.org/cvsweb/2007/mobileok-ref/src/org/w3c/mwi/mobileok/basic/xslt/NonTextAlternativesTest.xsl?content-type=text/x-cvsweb-markup NonTextAlternativesTest.xsl, by Roland

-> http://dev.w3.org/cvsweb/2007/mobileok-ref/src/org/w3c/mwi/mobileok/basic/xslt/functions.xsl?content-type=text/x-cvsweb-markup XSLT Utility libraries

Roland: I try to only use match/apply-templates, no for-each
... makes it easier to deal with getting several failures
... for each of my tests, I show the actual text of the test
... I have a script (moki) that allows me to run the XSLT against the tests in the test directory
... [explains some of the utility functions in functions.xsl]

[discussions on how to present the code-snippet, on a text vs nodes basis, and what can actually be achieved in XSLT]

Proposed dates for September F2F: 4th and 5th in Sophia Antipolis

[code review from the CTIC gang]

We note that XHTML Basic allows for URIs in its list of tokens, but we probably want to limit to the well-defined values

<nacho> [i know you die to know more about bopomofo, so... http://en.wikipedia.org/wiki/Zhuyin ]

Summary of Action Items

[NEW] ACTION: Ignacio to create a preliminary version of mOK checker User Guide and Developer Guide documents [recorded in http://www.w3.org/2007/06/13-bpwg-minutes.html#action05]
[NEW] ACTION: Ignacio to work with Miguel and Abel to implement the CSS stuff (removing JXCSS, implement validation, and use SAC for test implementation) [recorded in http://www.w3.org/2007/06/13-bpwg-minutes.html#action01]
[NEW] ACTION: Jo to annoy Ruadhan until he implements the in-memory caching per URIs [recorded in http://www.w3.org/2007/06/13-bpwg-minutes.html#action04]
[NEW] ACTION: Jo to evaluate how hard it would be to produce XML out of CSS stylesheets using SAC [recorded in http://www.w3.org/2007/06/13-bpwg-minutes.html#action02]
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.128 (CVS log)
$Date: 2007/06/13 16:43:54 $