W3C

– DRAFT –
Invisible XML community group

15 February 2022

Attendees

Present
-
Regrets
Tomos Hillman
Chair
Steven
Scribe
cmsmcq

Meeting minutes

Review of actions

Tom's action about the repo is done.

Steven's action to draft wording on #24 is done.

Status of implementations

NDW has written a Saxon extension function and should announce it this week.

JL, BTW: nothing to report.

MSM has made progress: his test harness is now finding more errors in the test cases than in the test harness itself.

Status of testing and test suites

MSM is working on a new collection of tests based on grammars collected in 2018/2019.

SP: I've converted a grammar of Pascal to ixml.

NDW: I'll make a test case out of my grammar for Java property files.

Property files are simple at first glance, but there are a lot of complications relating to continuations, escaping, and so on.

The description is very procedural, so it has been a challenge to make a declarative version.
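
To illustrate the kind of procedural rule NDW is describing, here is a minimal sketch of continuation handling only. It assumes the usual java.util.Properties conventions (a natural line ending in an odd number of backslashes continues onto the next line, leading whitespace on each natural line is dropped, and comment lines start with "#" or "!"). The function name logical_lines is hypothetical, and the sketch does not handle escapes or key/value splitting; it is not NDW's grammar.

```python
import re

def logical_lines(text):
    """Join natural lines into logical lines (continuations only, simplified)."""
    result = []
    pending = None  # partial logical line still waiting for its continuation
    for natural in re.split(r"\r\n|\r|\n", text):
        piece = natural.lstrip(" \t\f")
        if pending is None and (not piece or piece[0] in "#!"):
            continue  # blank and comment lines are skipped when not continuing
        current = piece if pending is None else pending + piece
        # An odd number of trailing backslashes escapes the line terminator.
        trailing = len(piece) - len(piece.rstrip("\\"))
        if trailing % 2 == 1:
            pending = current[:-1]  # drop the escaping backslash and keep going
        else:
            result.append(current)
            pending = None
    if pending is not None:
        result.append(pending)  # input ended in the middle of a continuation
    return result
```

Even this small fragment needs mutable state and a backslash count, which suggests why a declarative formulation takes some effort.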

JL: would it be helpful to have a number of test cases that are big? To test size limits, for example.

NDW: Yes.

SP: Pascal has been interesting, in part because the published grammar is ambiguous and has unreachable rules.

The ambiguity is perhaps due to Wirth's assumption that the lexer would make appropriate choices.

MSM: I'd propose that we make a collection of grammars of common notations.

JL: like Gunther Rademacher's.

The collection of grammars for REx is very useful.

MSM proposes an ixml/grammars directory.

ACTION: MSM to make an ixml/grammars directory for grammars of published notations.

Issue #24 prefix recognition

<Steven> https://lists.w3.org/Archives/Public/public-ixml/2022Feb/0149.html

BTW: Is the clause "determining in the process ..." necessary, or entailed by the word "parse"?

We discussed the proposed text point by point.

ISSUE: Should marking the root symbol with an "@" be an error?

JL: is there ever a case in which ixml:state="..." would be part of the prescribed output?

MSM: right now, we cannot generate namespace-qualified attributes at all, so it's an out-of-band signal and no ambiguity is possible.

If and when we add namespace support, we can reserve the ixml namespace.

NDW: or another signal could be devised.

MSM: perhaps "if the parse failed, but a parse exists for a proper prefix of the input, ..."

<Steven> * Processors must parse the input using the grammar.

<Steven> * If the input is unambiguously described by the grammar, the resulting parse tree must be serialized to an XML document.

<Steven> * If more than one parse tree describes the input, the processor must serialize one of them. It is not defined how this choice is made, but the resulting parse should be marked as ambiguous by including on the document element of the serialisation the attribute ixml:state="ambiguous".

<Steven> * If the parse fails, the processor must produce some XML document with ixml:state="failed" on the document element, with helpful information about where and why it failed; it may be a partial parse tree that includes parts of the parse that succeeded.

<Steven> * If the parse succeeded, but without consuming the entirety of the input, processors may choose either to produce a failure document as described above, or to serialize the resulting parse tree with the attribute ixml:state="prefix", or if the parse is ambiguous ixml:state="ambiguous prefix".

<Steven> * Processors may provide user options for behaviors such as parsing the largest, or smallest, prefix of the input that is described by the grammar, or supporting invocation with input streams of indeterminate length.

<Steven> * Processors may provide a user option to suppress the ixml:state="ambiguous" attribute; they may also provide a user option to produce more than one parse tree in the case of ambiguity.

<Steven> * The form in which XML documents are produced is not constrained by this specification; processors should be capable of producing serialized XML as a character stream, but other forms (e.g. DOM instances or XDM instances) may also be used.

<Steven> * If the root node in the grammar is marked as an attribute, processors must ignore that marking whenever serializing the rule as the root.
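
As a reading aid, the ixml:state values in the proposed text above can be sketched as a small decision function. This is only an illustration of the wording as minuted, not spec text; the function name ixml_state and its parameters tree_count, consumed_all_input and report_prefix are hypothetical, and the prefix cases were still subject to revision.

```python
from typing import Optional

def ixml_state(tree_count: int, consumed_all_input: bool,
               report_prefix: bool = True) -> Optional[str]:
    """Return the ixml:state value for the document element, or None if no
    attribute is called for.

    tree_count         -- number of parse trees found (0 means the parse failed)
    consumed_all_input -- whether the parse covers the entire input
    report_prefix      -- hypothetical option: report a prefix parse rather
                          than producing a failure document
    """
    if tree_count == 0:
        return "failed"
    if consumed_all_input:
        return "ambiguous" if tree_count > 1 else None
    if not report_prefix:
        return "failed"  # the processor chose to treat a prefix parse as failure
    return "ambiguous prefix" if tree_count > 1 else "prefix"
```

For example, ixml_state(2, True) yields "ambiguous", and ixml_state(1, False) yields "prefix" unless the processor opts for a failure document.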

SP: the item about prefix matching will be rewritten, but not in real time.

RESOLUTION: accept the proposal for issue #24 as revised (and with further revision of the item on prefix matching).

ACTION: SP to integrate resolution of #24 into the spec, and close #24.

Conformance issue: which grammars to accept and reject (et al.)

JL: for the case of non-XML characters in names, sometimes you can detect the problem statically, sometimes you can statically detect the *possibility* of such errors, and sometimes you don't know until you see the data.

SP: are we unhappy with the idea of dynamic errors?

MSM: I think we are happy with distinguishing dynamic and static errors, but the current wording, which requires conforming processors to reject non-conforming grammars, needs to be updated.

JL, NDW: Several classes of errors:

(1) nongrammatical against spec grammar.

(2) grammatical but statically wrong (e.g. two rules for same nonterminal)

(3) statically correct but dynamic error (e.g. non-XML output)

MSM: (And the third item is why I will raise an issue about @root ...)

NDW: if we make multiple same-named attributes illegal, @root should be illegal, too

RESOLUTION: we will distinguish static and dynamic errors in grammars, and processors are allowed to detect dynamic errors statically.
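
The difference between error classes (2) and (3) can be sketched with two small checks: one that needs only the grammar, and one that needs the output produced for a particular input. This is a hedged illustration, not a conformance requirement; the function names are hypothetical, and the character ranges are those of the Char production in XML 1.0.

```python
from collections import Counter

def duplicate_rule_names(rule_names):
    """Static check: nonterminals defined by more than one rule."""
    counts = Counter(rule_names)
    return sorted(name for name, n in counts.items() if n > 1)

def is_xml_char(ch):
    """True if ch matches the Char production of XML 1.0."""
    cp = ord(ch)
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def first_non_xml_char(text):
    """Dynamic check: index of the first character that cannot appear in
    XML output, or None if every character is allowed."""
    for i, ch in enumerate(text):
        if not is_xml_char(ch):
            return i
    return None
```

The first check can run as soon as the grammar is loaded; the second depends on the data, unless the processor can show from the grammar alone that an offending character would always be emitted.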

JL volunteers to scribe next week.

Summary of action items

  1. MSM to make an ixml/grammars directory for grammars of published notations.
  2. SP to integrate resolution of #24 into the spec, and close #24.

Summary of resolutions

  1. accept the proposal for issue #24 as revised (and with further revision of the item on prefix matching).
  2. we will distinguish static and dynamic errors in grammars, and processors are allowed to detect dynamic errors statically.

Summary of issues

  1. Should marking the root symbol with an "@" be an error?

Minutes manually created (not a transcript), formatted by scribe.perl version 185 (Thu Dec 2 18:51:55 2021 UTC).

Diagnostics

Maybe present: BTW, JL, MSM, NDW, SP