Re: Percent-encoding in DM still broken?

* Richard Cyganiak <richard@cyganiak.de> [2012-01-31 15:56+0000]
> I just reviewed the responses to Last Call comments. It appears that Souri's comment here hasn't been properly handled:
> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Nov/0000.html
> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Nov/0002.html
> 
> Or am I looking at the wrong document? I'm looking at this:
> http://www.w3.org/2001/sw/rdb2rdf/directMapping/LC/Overview.html

That's the correct document. These mods should resolve the issues listed in <http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Nov/0002> without making all non-ASCII text impossible to read:

- * Replace each PERCENT SIGN character ('%', U+0025) with the string "%25".
+ * Replace each character in the range U+0000 to U+001F with the percent-encoded form of that character [RFC3986].
+ * Replace each of these characters
        PLUS SIGN character    ('+', U+002B)
        PERCENT SIGN character ('%', U+0025)
        LESS-THAN SIGN         ('<', U+003C)
        GREATER-THAN SIGN      ('>', U+003E)
        QUOTATION MARK         ('"', U+0022)
        LEFT CURLY BRACKET     ('{', U+007B)
        RIGHT CURLY BRACKET    ('}', U+007D)
        VERTICAL LINE          ('|', U+007C)
        CIRCUMFLEX ACCENT      ('^', U+005E)
        GRAVE ACCENT           ('`', U+0060)
        REVERSE SOLIDUS        ('\', U+005C)
    with the percent-encoded form of that character [RFC3986].
  * For table names, replace each NUMBER SIGN character ('#', U+0023) with the string "%23".
  * For table names, replace each SOLIDUS character ('/', U+002f) with the string "%2f".
  * For attribute names, replace each HYPHEN-MINUS character ('-', U+003d) with the string "%3D".
  * For attribute values, replace each FULL STOP character ('.', U+002e) with the string "%2E".
  * Replace each SPACE character (U+0020) with the PLUS SIGN character (+, U+002B).
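
For what it's worth, here is a rough Python sketch of how I read the rules above. It is not spec text. The function name `percent_encode` and the `extra` parameter (for the context-specific characters, e.g. '#' and '/' in table names) are made up for illustration, and it assumes the first rule is meant to cover every control character below SPACE. It works in a single pass so that '%' and '+' are not double-encoded, and it goes by the characters named in the list rather than the code points.

```python
# Characters the list above percent-encodes in every context
# (taken by character name, not by the code points typed in the list).
ALWAYS_ENCODED = set('+%<>"{}|^`\\')


def percent_encode(text: str, extra: str = "") -> str:
    """Hypothetical sketch of the proposed minimal encoding.

    `extra` holds context-specific characters to encode as well,
    e.g. "#/" when encoding a table name.
    """
    out = []
    for ch in text:
        if ch == " ":
            # SPACE becomes PLUS SIGN; a literal '+' is encoded as %2B below.
            out.append("+")
        elif ord(ch) < 0x20 or ch in ALWAYS_ENCODED or ch in extra:
            # Percent-encode the UTF-8 octets, using uppercase hex digits.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            # Everything else, including non-ASCII, passes through unchanged.
            out.append(ch)
    return "".join(out)


print(percent_encode("50% off + tax"))
print(percent_encode("Zürich\t<x>"))
print(percent_encode("a/b#c", extra="#/"))
```

This emits uppercase hex digits ("%2F"), where the list above writes "%2f". RFC 3986 treats the two as equivalent and recommends uppercase.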


> We discussed this on a call here:
> http://www.w3.org/2011/11/08-RDB2RDF-minutes.html
> but the minutes don't capture a clear resolution to the issue. At any rate it's still broken and needs to be fixed.

We had test cases which exemplify the decision around a minimal encoding:

http://www.w3.org/2001/sw/rdb2rdf/wiki/R2RML_Test_Cases_v1#I18NnoSpecialChars

> The simplest fix might be to replace the definition of “percent-encode” in the DM spec with a reference to “IRI-safe” in R2RML:
> http://www.w3.org/2001/sw/rdb2rdf/r2rml/#dfn-iri-safe
> 
> Best,
> Richard

-- 
-ericP

Received on Tuesday, 31 January 2012 16:57:08 UTC