Bug 20275 - [XT3TS] document-2004 to 2006
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XSLT 3.0 Test Suite
Version: Working drafts
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Abel Braaksma
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Depends on:
Reported: 2012-12-06 16:35 UTC by Tim Mills
Modified: 2015-04-15 15:22 UTC

Windows screenshot of UTF8 name of file (14.07 KB, image/jpeg)
2015-03-31 03:31 UTC, Abel Braaksma

Description Tim Mills 2012-12-06 16:35:08 UTC
I'm failing to find the requested document xgespr%C3%A4ch.xml.  I have the test suite checked out onto a Windows file system.  If I run the XQuery

fn:put(<foo />, 'xgespr%C3%A4ch.xml')

it creates a file with a name distinct from the one coming out of Mercurial.

-a---        29/11/2012     11:04         90 xgespräch.xml
-a---        06/12/2012     16:33         44 xgespräch.xml

The filename differs from that used in the old test suite.

I don't know whether Bugzilla will preserve the non-ASCII characters in the above...
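
As a rough illustration of how one percent-encoded string can end up as several distinct file names, the Python sketch below decodes 'xgespr%C3%A4ch.xml' and then derives two other spellings a tool might produce: the UTF-8 bytes misread in a single-byte Windows code page, and the decomposed (NFD) form. Whether either of these explains the listing above is an assumption; this report does not say which mechanism is at work.

```python
import unicodedata
from urllib.parse import unquote, unquote_to_bytes

encoded = "xgespr%C3%A4ch.xml"

# Percent-decoding as UTF-8 yields the intended name with a precomposed a-umlaut.
intended = unquote(encoded, encoding="utf-8")

# Assumption: some tool reads the same UTF-8 bytes as Windows code page 1252.
garbled = unquote_to_bytes(encoded).decode("cp1252")

# Assumption: some tool or file system stores the decomposed form ("a" + U+0308).
decomposed = unicodedata.normalize("NFD", intended)

# ascii() keeps the output readable on consoles that cannot print non-ASCII.
for label, name in [("intended", intended),
                    ("garbled", garbled),
                    ("decomposed", decomposed)]:
    print(label, ascii(name), len(name))
```

All three strings can look alike, or nearly alike, in a directory listing while still comparing as different names.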
Comment 1 Michael Kay 2012-12-06 16:59:09 UTC
I guess there's a lot that can go wrong here, what with differences in filesystems, transmission through Mercurial, etc. Need to think if there's a better way of doing it, e.g. having something in the environment that creates a file with given name and content.
Comment 2 Abel Braaksma 2013-11-23 16:22:40 UTC
There seems to be a fix for Mercurial in the form of an extension [1] that has to be installed on the server side. The particular files with non-ASCII characters will have to be submitted again to fix the issue.

This is not a Windows or Linux issue per se. It is caused by design choices of the Mercurial developers. For instance, on Windows they use the non-Unicode file API functions, which results in this mess; on Linux there are other issues. Their encoding strategy page [2] states that the best solution is either to keep all filenames ASCII-only, or to make sure clients and servers run on the same system (Windows server with only Windows clients, Linux server with only Linux clients, etc.).

For us, that's all bad news. The fix may work, but I doubt whether W3C can, or will, install it. The extension is not supported by Mercurial and is still in beta.

My suggestion is that we add a feature that creates a file, whose contents can be inlined, as part of the prerequisites of a test. From what I've read, that seems to be the only sure-fire way of fixing this.
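
A minimal sketch of what such a prerequisite step could look like in a test driver, written in Python. The helper name create_prerequisite, the scratch directory, and the file content "<doc/>" are all hypothetical; the test suite defines no such feature, and the real content of the file is not given in this report.

```python
import tempfile
from pathlib import Path


def create_prerequisite(directory: Path, name: str, content: str) -> Path:
    """Write `content` to `directory/name` as UTF-8 and return the new path."""
    target = directory / name
    target.write_text(content, encoding="utf-8")
    return target


# Hypothetical usage: build the file locally instead of checking it out.
workdir = Path(tempfile.mkdtemp())
created = create_prerequisite(workdir, "xgespr\u00e4ch.xml", "<doc/>")
print(ascii(created.name))
```

Because the file is created on the machine that runs the test, its name never passes through the version control system.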

[1] http://mercurial.selenic.com/wiki/FixUtf8Extension
[2] http://mercurial.selenic.com/wiki/EncodingStrategy
Comment 3 Abel Braaksma 2015-03-31 03:29:35 UTC
It seems that the repository server has been updated. I just tested adding the same file again with the original umlaut: xgespräch.xml. To test whether it was successful, I cloned a new copy from the remote server. The result is as in the screenshot (Windows, not sure if Mac is similarly successful).

I'm a little surprised that it is fixed now, as the bug reports related to this have been postponed multiple times.

Anyway, I reran the tests from a clean copy and they succeed, so I'm going to resolve this long-standing bug.
Comment 4 Abel Braaksma 2015-03-31 03:31:10 UTC
Created attachment 1590
Windows screenshot of UTF8 name of file

Screenshot added.
Comment 5 Michael Kay 2015-03-31 08:11:20 UTC
Unfortunately I don't think the fact that you're seeing OK results means that everyone else is. I still have a problem with these files. If I commit them from a Mac, anyone using them from Windows has problems, and vice versa.
Comment 6 Abel Braaksma 2015-04-03 02:54:14 UTC
> If I commit them from a Mac, anyone using them from Windows has 
> problems, and vice versa.

I think they should be good now; I sent you a private mail. If the file structure currently works for you *and* for me (test it by cloning the Hg repository again), then at least it works on Mac and Windows, and with a little bit of luck on Linux as well.

I don't think we should try to create a specific feature for an isolated case, as I suggested earlier. But please comment if you think we should.
Comment 7 Abel Braaksma 2015-04-09 13:27:56 UTC
After some private mail exchange with Michael Kay, it appears that what works on Windows is totally unworkable on Mac, leaving the system in an uncommittable state (whether this is a bug in Hg/Mercurial or SmartGit is irrelevant).

Proposal: fix this by

1. Removing the xgespräch.xml file ..
2. .. and the related garbled files from the repository.
3. Leaving xgesprach.xml in.
4. Editing the tests to add an instruction that:
> prior to running, you should copy xgesprach.xml to xgespräch.xml
5. Adding xgespräch.xml and the garbled names to the ignore list.

This way, Hg will no longer be messed up by these files, and you would need to create such a copy only once per clone of the repository (without it popping up as a new file to be added in your commit window). I think that is a doable workaround.
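
The one-time copy from step 4 could be scripted along these lines. This is only a sketch: the helper name ensure_umlaut_copy is made up for illustration, and the demo runs in a temporary directory with stand-in content because the location of the file inside the test suite is not stated here.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_umlaut_copy(directory: Path,
                       source_name: str = "xgesprach.xml",
                       target_name: str = "xgespr\u00e4ch.xml") -> Path:
    """Copy source_name to target_name in `directory` unless the copy exists."""
    source = directory / source_name
    target = directory / target_name
    if not target.exists():
        shutil.copyfile(source, target)
    return target


# Demo in a scratch directory; "<doc/>" is stand-in content, not the real file.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "xgesprach.xml").write_text("<doc/>", encoding="utf-8")
copy_path = ensure_umlaut_copy(demo_dir)
print(ascii(copy_path.name))
```

Calling the helper again is harmless, since it returns the existing copy without overwriting it.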
Comment 8 Abel Braaksma 2015-04-15 15:22:04 UTC
Fixed as suggested in previous comment.