W3C

– DRAFT –
Invisible XML

20 February 2024

Attendees

Present
Bethan, John, Michael, Norm, Steven
Regrets
-
Chair
Steven
Scribe
norm

Meeting minutes

Date: 2024-02-20

Review of agenda

ACTION 2023-01-10-f continues

ACTION 2023-10-17-a completed

ACTION 2023-11-28-a continues

ACTION 2023-11-28-c continues

ACTION 2023-11-28-e continues

ACTION 2024-01-09-a completed

Status reports

John: I've updated the workbench to fix bugs. Names weren't being recognized unless they were followed by a space.

John: Working on experiments with string matching and regular expression matching. Tidied up the UX a bit.

Norm: No status yet.

Bethan: The PhD is submitted!

Bethan: Have also started working on my implementation again!

Michael: Nothing to report.

Steven: I've staged the next version, but haven't pushed the changes. My time has been focused on the paper for Prague.

Steven: Paper is about round-tripping iXML.

John: I'm also doing some experiments on round-tripping. Can you create a stylesheet from the grammar such that, if you run it on the output, it will provide a flattened result?

Bethan: My instinct is that the grammar is to some extent a schema for the output XML, or embodies the same information as at least the part of the schema for the output you're producing.

John: Things become very complicated where operators are added back into the right places.

Steven: Someone else is submitting an iXML paper at Prague; he wants to use iXML to reparse XML, to extract information from the text nodes and put it back into the XML.

John: This is something like the work I've done parsing XPath out of XSLT.

Some additional discussion of the problems of round tripping.

Bethan: Could you leverage a schema to produce a grammar to parse some texts into the grammar?

Nods of agreement: there's something interesting about the intersection between grammars and schemas.

publication of ixml spec as W3C CG Report

Steven: This is finished, but there are some URL problems.

ACTION: Steven to contact W3C to get the report links fixed on the report and ixml group pages.

Issue #139 Sample grammars for IRIs and URIs

Steven: I published something today; we can discuss it next week.

Issue #202 Spec should say Unicode version is implementation-defined

ACTION: Steven to amend the specification to describe how Unicode is version-dependent

Issue #199 Require whitespace between prolog and first rule?

Norm: Whitespace is required between rules but not between the prolog and the first rule.

Norm: I think we should be consistent.

Steven: The space is needed between rules to avoid ambiguity; this is a change for the sake of change.

John: If we start to put multiple things in the prolog, things may get ambiguous.

Norm: I'm not going to lie down in the road if we leave this until we need to do it.

Bethan: I think given it's backwards incompatible, we should do it sooner rather than later.

Steven: We also have a larger prolog issue that I raised in email.

ACTION: Michael to make sure that Steven's prolog issues are turned into trackable issues.

Issue #192 Normalizing line endings in ixml inputs

Steven: This is a request from the broader community for a way to specify a line ending that's not platform-dependent.

Norm: That's the issue with a particular spin; I think the user would like us to just normalize to #A and move on.

Michael: I don't know what they do on an IBM mainframe that uses variable-length records. I suppose the obvious thing to do is to say that a record boundary turns into a #A.

Some discussion of what IBM mainframes actually do for storing text files.

Michael: I think the proposal is that the iXML spec should say that an implementation presents end-of-lines as #A regardless of the platform.

Steven: My problem with that solution is that if I get files over the web, I don't know where they came from.

Norm: I think XML solves this; there's a simple algorithm for deciding whether, and which, sequences of characters are turned into a single #A.
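
For reference, the algorithm Norm mentions can be sketched in a few lines of Python. This is a minimal illustration of line-end normalization as XML 1.0 describes it: the two-character sequence #D #A, and any #D not followed by #A, each become a single #A. The function name `normalize_line_ends` is hypothetical and is not taken from any iXML implementation. XML 1.1 additionally treats #85 and #2028 as line ends, which this sketch does not cover.

```python
def normalize_line_ends(text: str) -> str:
    """Hypothetical helper: XML 1.0-style line-end normalization."""
    # Replace CR LF pairs first, so that each pair yields one #A, not two.
    text = text.replace("\r\n", "\n")
    # Any CR still present was not followed by LF; it also becomes #A.
    return text.replace("\r", "\n")
```

Under this scheme a grammar would only ever see #A as a line end, whatever platform produced the input.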

Michael: I think it boils down to: when you're reading a character stream, you normalize line feeds. You have some built-in understanding of line boundaries and you recognize them.

Michael: The question in my mind is, if you wanted to use iXML to do something a little closer to the metal, how would you do it?

Norm: I think if you want to do that, you want to treat the input as some kind of binary, so it's out of scope.

Some discussion of the circumstances when you might want to process "binary" of one sort or another.

Bethan: Why not introduce an end-of-line marker for a non-platform-specific end-of-line?

Steven: That's what I proposed in response to Norm.

Bethan: My suggestion is that the character would be a shortcut for the expansion \n, \r\n, \r, etc.

John: Would you be able to do that in a member string?

Steven: You can't negate that easily, that was part of my example.

Michael: I may be misunderstanding, but I'm not a big fan of the idea of what Bethan suggests; but I'm not sure the problem Steven identifies is a real one.

Michael: Suppose we choose a single character for abstract "end of line"; NEL. If we said NEL means any linefeed, so when that's in grammar...

Steven: I think what Bethan proposed is a character in the grammar that represents this.

Michael: I can say, give me anything that's not that character.

Steven: But it's not only one character.
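
The point that a line end is "not only one character" can be illustrated outside iXML. The Python sketch below is an analogy using regular expressions, not iXML syntax, and the sample text and variable names are assumptions made for illustration. A negated character class excludes single characters, so "anything that is not a line end" is easy to write when the line end is always #A, but it misbehaves on raw input that contains #D #A pairs or bare #D.

```python
import re

# A "line" modelled as a run of characters that are not #A.
NOT_LF = re.compile(r"[^\n]+")

# Sample input mixing three line-end conventions (illustrative only).
raw = "one\r\ntwo\rthree\nfour"

# On raw input, excluding only #A leaves a stray #D in the first match
# and does not split at the bare #D between "two" and "three".
lines_raw = NOT_LF.findall(raw)

# After XML-style normalization every line end is a single #A,
# so the same single-character exclusion separates the lines cleanly.
normalized = raw.replace("\r\n", "\n").replace("\r", "\n")
lines_normalized = NOT_LF.findall(normalized)
```

Here `lines_raw` is `['one\r', 'two\rthree', 'four']`, while `lines_normalized` is `['one', 'two', 'three', 'four']`.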

Michael: If we do normalize, I think normalizing the same way as XML would be wiser, therefore #A.

Any other business?

John: I think we should try to keep track of where people are using iXML. I got a bug report from someone using iXML to parse some genetic data.

Adjourned.

<cmsmcq> FWIW, US daylight time starts 10 March. (Yikes.)

<cmsmcq> In the UK, it starts 31 March.

Summary of action items

  1. Steven to contact W3C to get the report links fixed on the report and ixml group pages.
  2. Steven to amend the specification to describe how Unicode is version-dependent
  3. Michael to make sure that Steven's prolog issues are turned into trackable issues.
Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).
