Path Normalization Causing Issues

If you try to normalize the following two IRIs:

".."
"http://example.com/foobar/"

You end up with:

""
"http://example.com/foobar/"

Resolving the former relative to the latter then gives:

"http://example.com/foobar/"

This is per section 5.3.2.4 of RFC 3987:
> The complete path segments "." and ".." are intended only for use  
> within relative references (section 4.1 of [RFC3986]) and are  
> removed as part of the reference resolution process (section 5.2 of  
> [RFC3986]). However, some implementations may incorrectly assume  
> that reference resolution is not necessary when the reference is  
> already an IRI, and thus fail to remove dot-segments when they occur  
> in non-relative paths. IRI normalizers should remove dot-segments by  
> applying the remove_dot_segments algorithm to the path, as described  
> in section 5.2.4 of [RFC3986].

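As ".." is an IRI, it can be normalized, which results in "". This is
problematic: resolving the un-normalized ".." against
"http://example.com/foobar/" gives "http://example.com/", so
normalizing before resolution changes the result. Should path segment
normalization only be done when there is a scheme and/or authority?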
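
To make the discrepancy concrete, here is a minimal sketch in Python. It
is an illustration under stated assumptions, not a reference
implementation: remove_dot_segments is my own transcription of the
algorithm in section 5.2.4 of RFC 3986, and urllib.parse.urljoin from
the standard library stands in for reference resolution (it handles
URIs, which is enough for these ASCII-only examples).

```python
from urllib.parse import urljoin


def remove_dot_segments(path: str) -> str:
    """Sketch of remove_dot_segments (RFC 3986, section 5.2.4)."""
    output = []  # completed segments, each with its leading "/" if any
    while path:
        if path.startswith("../"):        # step A: drop leading "../"
            path = path[3:]
        elif path.startswith("./"):       # step A: drop leading "./"
            path = path[2:]
        elif path.startswith("/./"):      # step B: "/./" becomes "/"
            path = path[2:]
        elif path == "/.":                # step B: "/." becomes "/"
            path = "/"
        elif path.startswith("/../"):     # step C: "/../" becomes "/", pop last segment
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":               # step C: "/.." becomes "/", pop last segment
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):         # step D: a lone "." or ".." is removed
            path = ""
        else:                             # step E: move the first segment to the output
            end = path.find("/", 1)
            if end == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:end])
                path = path[end:]
    return "".join(output)


base = "http://example.com/foobar/"
reference = ".."

# Normalizing the relative reference first reduces it to the empty string...
normalized_reference = remove_dot_segments(reference)

# ...so resolution returns the base unchanged,
resolved_after_normalizing = urljoin(base, normalized_reference)

# whereas resolving the original reference climbs one level.
resolved_directly = urljoin(base, reference)
```

With these inputs, normalized_reference is the empty string,
resolved_after_normalizing equals the base, and resolved_directly is
"http://example.com/".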


--
Geoffrey Sneddon
<http://gsnedders.com/>

Received on Tuesday, 9 September 2008 17:03:41 UTC