<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>24769</bug_id>
          
          <creation_ts>2014-02-21 19:40:00 +0000</creation_ts>
          <short_desc>SVG Path BNF is ambiguous</short_desc>
          <delta_ts>2014-02-24 18:21:54 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>SVG</product>
          <component>Paths</component>
          <version>SVG 2.0</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>Test Suite</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Joe Gregorio">jcgregorio</reporter>
          <assigned_to name="Doug Schepers">schepers</assigned_to>
          <cc>ian</cc>
          
          <qa_contact name="SVG Public List">www-svg</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>101139</commentid>
    <comment_count>0</comment_count>
    <who name="Joe Gregorio">jcgregorio</who>
    <bug_when>2014-02-21 19:40:00 +0000</bug_when>
    <thetext>http://www.w3.org/TR/SVG2/single-page.html#paths-PathDataBNF

There is no description in the text on how this BNF is to be used. For example, are the rules for the BNF alternatives to be parsed as &apos;first match wins&apos;?

If so, then the following rules won&apos;t produce good results, as they may match the leading digits of a floating-point number as an integer-constant before the floating-point-constant rule is ever tried:

nonnegative-number:
    integer-constant
    | floating-point-constant
number:
    sign? integer-constant
    | sign? floating-point-constant

If the BNF is intended to be &apos;first match wins&apos; then these rules should be written as:

nonnegative-number:
    floating-point-constant
    | integer-constant
number:
    sign? floating-point-constant
    | sign? integer-constant

This is also true of SVG 1.1 and 1.2 Tiny.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>101246</commentid>
    <comment_count>1</comment_count>
    <who name="Joe Gregorio">jcgregorio</who>
    <bug_when>2014-02-24 18:21:54 +0000</bug_when>
    <thetext>OK, so I think part of the problem here is my use of the ambiguous terms
first-match vs longest-match. What we are really seeing here is a difference
between a PEG (parsing expression grammar) and a CFG (context free grammar).

http://en.wikipedia.org/wiki/Parsing_expression_grammar
http://en.wikipedia.org/wiki/Context-free_grammar
http://en.wikipedia.org/wiki/Comparison_of_parser_generators

The BNF supplied is a valid CFG. Parsers built for CFGs can backtrack, which
is one way to handle the ambiguity in the &quot;nonnegative-number&quot; rule. The other way to handle it is to rearrange the grammar so that it is unambiguous.

From the page http://en.wikipedia.org/wiki/Parsing_expression_grammar:

&quot;&quot;&quot;Syntactically, PEGs also look similar to context-free grammars (CFGs), 
 but they have a different interpretation: the choice operator selects 
 the first match in PEG, while it is ambiguous in CFG.&quot;&quot;&quot;

This isn&apos;t a theoretical problem. I am building a parser for SVG paths, and as a start I copied the supplied BNF into a parser generator. Unfortunately, the parser generator I am using is a PEG parser generator, not a CFG one:

  https://github.com/dmajda/pegjs#expression1--expression2----expressionn

I am apparently not the only one to get tripped up by this, as this parser
also has the same problem:

  https://www.npmjs.org/package/svg-path-parser

If you change the following line in their example:

    , &apos;A25,35 -80 1,1 450,220 Z&apos;

to:

    , &apos;A25.0,35 -80 1,1 450,220 Z&apos;

then that parser will crash on what should be a valid path, since the 25 gets 
parsed as an &quot;integer-constant&quot;, leaving .0 on the input, which fails to parse as anything in the grammar.
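
To make the failure concrete, here is a minimal sketch of PEG-style ordered choice applied to the two alternatives. It is not taken from either parser mentioned above. The regular expressions are my own simplified reading of the integer-constant and floating-point-constant rules, and the names ordered_choice, SPEC_ORDER and PROPOSED_ORDER are made up for illustration.

```python
import re

# Assumption: simplified stand-ins for the BNF rules, written as regexes.
INTEGER = re.compile(r"[0-9]+")
FLOAT = re.compile(
    r"(?:[0-9]*\.[0-9]+|[0-9]+\.)(?:[eE][+-]?[0-9]+)?"  # fractional-constant exponent?
    r"|[0-9]+[eE][+-]?[0-9]+"                           # digit-sequence exponent
)


def ordered_choice(alternatives, text):
    """PEG-style choice: commit to the first alternative that matches.

    Returns (rule name, matched text, remaining input), or None if no
    alternative matches at the start of the input.
    """
    for name, pattern in alternatives:
        m = pattern.match(text)
        if m:
            return name, m.group(0), text[m.end():]
    return None


# Alternatives in the order the spec currently lists them.
SPEC_ORDER = [("integer-constant", INTEGER),
              ("floating-point-constant", FLOAT)]

# Alternatives in the order proposed in this bug.
PROPOSED_ORDER = [("floating-point-constant", FLOAT),
                  ("integer-constant", INTEGER)]

print(ordered_choice(SPEC_ORDER, "25.0,35"))
print(ordered_choice(PROPOSED_ORDER, "25.0,35"))
```

With the spec order, the first call commits to integer-constant, consumes only 25, and leaves .0,35 behind. With the proposed order, the whole 25.0 is consumed as a floating-point-constant, and a plain 25 still falls through to integer-constant.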

In this case I think the best solution is to rearrange the grammar slightly as suggested, since that makes it valid under both the PEG and CFG interpretations. The alternative would be to add a sentence to the spec stating that the BNF
is a CFG and is not to be used as a PEG.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>