Invisible XML Community Group

Meeting minutes

Accept the minutes of the previous meeting

Accepted.

Review of open actions

No progress.

Status reports

Norm: I got NineML 3.2.9 out; it includes an attempt to provide better diagnostics on failed parses.

John: I've done something similar with my parser; it shows you the open rules.
… The real problem is when you have ambiguity as well.

Bethan: Even when you have ambiguity, the failure comes at a particular point in the parse. You can't be parsing more than one thing.

John: There are two cases. One is a failure in the middle of the input: there's nothing else that can be predicted. That might be under a number of different branches.
… The other one is where you got to the end of the input and there's nothing left. What was supposed to come next?
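
To make the two failure cases concrete, here is a minimal sketch. It is not how NineML or any other ixml processor reports errors; it assumes a toy "grammar" that is just a set of literal alternatives, and the function name `diagnose` is hypothetical. It tracks which alternatives (branches) are still live and reports what could have come next.

```python
def diagnose(text, alternatives):
    """Match text against literal alternatives and explain any failure.

    Toy stand-in for a real parser: 'alternatives' plays the role of the
    branches that are still open at each position.
    """
    live = list(alternatives)
    for pos, ch in enumerate(text):
        # Characters that any still-open branch would accept at this position.
        expected = sorted({alt[pos] for alt in live if pos < len(alt)})
        still_live = [alt for alt in live if pos < len(alt) and alt[pos] == ch]
        if not still_live:
            # Case 1: failure in the middle of the input.
            wanted = expected or ["end of input"]
            return f"failed at offset {pos}: saw {ch!r}, expected one of {wanted}"
        live = still_live
    if text in live:
        return "ok"
    # Case 2: the input ran out while branches were still open.
    expected = sorted({alt[len(text)] for alt in live})
    return f"input ended at offset {len(text)}: expected one of {expected}"


mid_failure = diagnose("abx", ["abc", "abd"])
end_failure = diagnose("ab", ["abc", "abd"])
```

In both cases the message lists the union of what every open branch expected, which is the point about a failure sitting under a number of different branches.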

Bethan: What I'm interested in working on are tools that will treat your grammar as a generator rather than a recognizer.
… So that, for fairly small grammars, you can generate a sample of what is recognized.
… So you can look at that and go "Uh, no, not that!"
… Sometimes it can be very helpful to walk through a regular expression, for example, and recount what could be matched.
… That process of talking through is a process of generating from the regular expression.

John: Can you do this from any grammar without ambiguity?

Some discussion of generators.
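
As a rough illustration of treating a grammar as a generator, here is a minimal sketch. It assumes a hypothetical dictionary-based grammar rather than real ixml syntax; the names `GRAMMAR` and `generate` are invented for this example. It enumerates every string derivable within a depth budget, so recursive rules still terminate.

```python
import itertools

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives,
# and each alternative is a list of symbols. Any symbol that is not a key
# in the grammar is treated as a literal.
GRAMMAR = {
    "date": [["day", "-", "month"]],
    "day": [["1"], ["31"]],
    "month": [["Jan"], ["Mar"]],
}


def generate(symbol, grammar, depth=4):
    """Yield every string derivable from symbol within the depth budget."""
    if symbol not in grammar:
        yield symbol  # literal
        return
    if depth == 0:
        return  # budget exhausted: prune this branch
    for alternative in grammar[symbol]:
        # Expand each symbol of the alternative, then combine the pieces.
        parts = [list(generate(s, grammar, depth - 1)) for s in alternative]
        for combo in itertools.product(*parts):
            yield "".join(combo)


samples = sorted(generate("date", GRAMMAR))
# samples == ['1-Jan', '1-Mar', '31-Jan', '31-Mar']
```

Reading the sample is where the "Uh, no, not that!" reaction comes from: anything in the list that should not be accepted points at a rule that is too loose. An ambiguous grammar would simply yield the same string more than once.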

Bethan: I'm also interested in trying to work out what a grammar might be *trying* to match: phone numbers, UK post codes, etc.
… And then point people to versions of those grammars.

Fredrik: I've been thinking about rendering parts of the graph when a parse fails.

Bethan: I wonder if we can make something like Gunther's railroad diagrams; or maybe overlapping circles, some way to indicate what belongs to what and what is partial.

John: I'm starting to see users finding really weird errors, which is nice.

Some discussion of the origins of the test case that demonstrated the problem of non-XML output (a single text node).

John: It's worth making people aware that record-oriented processing can produce output that isn't well-formed.

Fredrik: With the current state of the parsers, it's very expensive to parse a huge file. It's more efficient to do it in a record-oriented way.

Bethan: One of the things I've been thinking about recently is that we have a bunch of dynamic errors that are things like there not being a single root node, or two attributes with the same name on the same element.
… Earlier the spec says that conformance is about processors and grammars, not the combination of a grammar and an input.
… That being the case, I think these should be static errors on the grammar, not errors that are only thrown if a particular input produces XML that is not well-formed.
… I'm convinced you can do it.

John: Take the case of duplicated attributes: whether that happens depends on the input.
… If you said that it was a static error, that would be a problem.

Fredrik: Do you have an example?

Bethan: I think a single root element is relatively easy to work out; but suppressed elements make it a little harder.
… I think we should think about removing the statement from the spec or we should talk about changing those to static errors.

Fredrik: I sort of agree, but I think it might be easy to create a grammar that will create a root node or not. And that's entirely dependent on the input.

Bethan: I think the spec implies that if a grammar could produce XML that is not well-formed, the grammar isn't conformant.

John: Does that not imply that the universe of inputs is infinite?

Some further discussion of what this analysis would look like.

ACTION: Bethan to review the dynamic errors and see which ones might possibly be resolved statically.

Issue #294, “parse tree” is not defined in the specification

Norm: Let's wait for Steven.

Perspectives on serialization

Norm: Wait for Steven.

Norm: Should I try to improve that though?

Thumbs up, consensus that it's worth making another pass to improve it.

Next meeting

18 March. Note that it will be 1 hour later in the US. No regrets heard.

Any other business

None heard.

– DRAFT –
4 March 2025

Attendees