file-system-api: filename restrictions

Section 8 "Uniformity of interface" will cause headaches for some use
cases.  For example, an application may want to allow the user to fill
a directory with images, then output a thumbnail of each image "x.jpg"
into a subdirectory with the same filename, "thumbs/x.jpg".

However, we're forbidden from creating new files with "invalid"
filenames, even if files with those names already exist elsewhere.
The operation will fail, and we'll have to tell our Linux users with
images named "at the beach: moon rock?.jpg" that they have to obey
Windows filename conventions, which will probably upset them.  It'd
also be a difficult rule for users to follow: it's easy on Windows,
where every application enforces it, but Linux users would have to
memorize the rules themselves.
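
To make the failure concrete, here is a rough Python sketch of the kind
of check involved.  It assumes Windows-like rules (the reserved
characters < > : " / \ | ? *, no control characters, no trailing space
or period); the function name and the exact rule set are assumptions
for illustration, not the spec's wording.

```python
# Hypothetical check, assuming Windows-like filename rules.
# The spec's actual rule set may differ from this.
RESERVED_CHARS = set('<>:"/\\|?*')


def is_portable_name(name: str) -> bool:
    """Return True if `name` passes the assumed Windows-like rules."""
    if not name:
        return False
    for ch in name:
        # Reject reserved punctuation and control characters below U+0020.
        if ch in RESERVED_CHARS or ord(ch) < 0x20:
            return False
    # Names may not end in a space or a period.
    return not name.endswith((" ", "."))
```

Under these assumed rules "x.jpg" passes, while "at the beach: moon
rock?.jpg" is rejected because of both the ":" and the "?".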

It's also a pain for backing up files, e.g. copying "moon rock?.jpg" to
"moon rock?.jpg~", and for "safe writes": writing to "moon
rock?.jpg.new" and then renaming the finished file over the original.
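
For reference, the "safe write" pattern looks roughly like this in
Python.  This is only a sketch using ordinary OS file calls rather
than the File System API; safe_write is a made-up helper name and the
".new" suffix is just the convention mentioned above.

```python
import os


def safe_write(path: str, data: bytes) -> None:
    """Write `data` to `path` by way of a temporary sibling file."""
    # The temporary name is derived from the original, so it inherits
    # any "invalid" characters the original name contains.
    tmp_path = path + ".new"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before renaming
    # os.replace overwrites an existing destination file.
    os.replace(tmp_path, path)
```

The point is that the temporary name is built from the existing name,
so if "moon rock?.jpg" can't be created through the API, neither can
"moon rock?.jpg.new".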

These seem like bigger problems than the one the restriction is trying
to solve.  Wouldn't it be sufficient for these rules to define which
filenames must be supported, state that any others may not be, and
suggest that the UA log a warning when nonportable filenames are
created?  (Of all filename issues, the only one that I've ever found
to be a serious real-world portability issue is case-insensitivity.)

I guess there are other issues with reading data created outside of the API:

- filenames that can't be decoded to a DOMString, e.g. undecodable
bytes in a UTF-8 filesystem.  This is common in Linux after, e.g.,
unzipping a ZIP containing SJIS filenames.  Should the UA simply skip
these and log a message?
- existing filenames that differ only by case.  Similarly, should the
UA just ignore all but one of them and log a message to the console?
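
Both cases are easy to detect when listing a directory.  A rough
Python sketch, assuming the raw names are available as bytes and that
str.casefold() is an acceptable stand-in for whatever case-folding the
spec intends (scan_names is a hypothetical helper, not part of any
API):

```python
from collections import defaultdict


def scan_names(raw_names):
    """Return (undecodable names, groups of names differing only by case)."""
    undecodable = []
    by_folded = defaultdict(list)
    for raw in raw_names:
        try:
            name = raw.decode("utf-8")  # strict decoding raises on bad bytes
        except UnicodeDecodeError:
            undecodable.append(raw)
            continue
        # Assumption: casefold() approximates the intended case-insensitivity.
        by_folded[name.casefold()].append(name)
    collisions = [names for names in by_folded.values() if len(names) > 1]
    return undecodable, collisions
```

A UA could then skip everything in the first list, pick one entry
from each group in the second, and log the rest.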

Should "whitespace" in section 8.3 simply mean the space character,
U+0020?  Windows does allow creating filenames ending with NBSP and
other Unicode whitespace characters, and it's not clear whether the
spec should allow them too.  Other whitespace (\r, \n, \t) is already
covered by the control character rule.
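
The two readings give different answers for a name ending in NBSP.  A
small Python illustration; both function names are made up, and
str.isspace() is only one possible definition of "Unicode whitespace":

```python
def ends_with_space(name: str) -> bool:
    """Narrow reading: only U+0020 counts as trailing whitespace."""
    return name.endswith(" ")


def ends_with_unicode_whitespace(name: str) -> bool:
    """Broad reading: any character Python's str.isspace() accepts."""
    return bool(name) and name[-1].isspace()
```

For "photo\u00a0" the narrow check returns False and the broad one
returns True, so the spec's choice decides whether that name is legal.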

Sorry if this is a rehash of past topics.

-- 
Glenn Maynard

Received on Sunday, 19 December 2010 21:25:47 UTC