Bugzilla – Bug 11482
WF2: Change accept="" to accept file extensions as well as MIME types (maybe based on whether they start with a period)
Last modified: 2012-04-18 23:23:00 UTC
Requiring MIME types like
order to allow a .docx file to be uploaded is far too technical for the
layman. On the other hand it is equally impractical for web developers to
keep an accurate mapping of extension to MIME type within our code because
those types change and new types are added with some frequency. As a matter
of practicality file extensions should be acceptable tokens for the "accept"
attribute in order to make it useful.
see http://www.w3.org/Bugs/Public/show_bug.cgi?id=11481 for a bit more detail... (marking that bug as a duplicate)
*** Bug 11481 has been marked as a duplicate of this bug. ***
MIME types may be awkward, but at least they are documented. File extensions are, as far as I know, completely unregistered. There are some web sites with incomplete maps (NIST had one, and there is fileext) and a number of known conflicts (the same extension used for two meanings). I fear that this would not result in good interoperability.
I don't see what compat problems could come from this. Using a file extension in @accept would just let browsers filter the file upload dialogs by file extension, a functionality already present in major OSes and used by many programs to good effect.
Browsers don't need to care about what the extensions represent or what type of files the author is "really" talking about. They just need to know that the author wants files with a particular extension. That's easy information to get from the OS, and useful for the user.
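To illustrate the kind of filtering described above, here is a minimal sketch in Python. It is not taken from any browser: the function name `filter_by_extension` and the sample file names are made up for illustration, and the matching is purely textual and case-insensitive.

```python
from fnmatch import fnmatch


def filter_by_extension(filenames, extensions):
    """Keep only the names whose extension matches one of the given tokens.

    `extensions` holds tokens such as ".docx". Matching is case-insensitive
    and purely textual, with no knowledge of what the format really is.
    """
    # Turn each ".ext" token into a dialog-style pattern such as "*.ext".
    patterns = ["*" + ext.lower() for ext in extensions]
    return [
        name for name in filenames
        if any(fnmatch(name.lower(), pattern) for pattern in patterns)
    ]


# Hypothetical directory listing, for illustration only.
files = ["report.docx", "notes.txt", "LETTER.DOCX", "photo.png"]
print(filter_by_extension(files, [".docx"]))  # ['report.docx', 'LETTER.DOCX']
```

The filter never needs to know what a .docx file is, which is the point being made: the author names an extension and the dialog narrows the listing accordingly.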
I think before extending this further we should first have solid implementations of accept="" as it stands today. The use case for "image/*" for instance is really big but there are not many user agents that do something clever with it.
(This is also the reason file extensions are not that nice as you cannot do something clever with them. You can just filter. If you have the media type you could offer a shortcut to Word for instance. If you just have the file extension that becomes a bit more dodgy.)
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
Change Description: no spec change
Rationale: I'm rejecting this mostly on the grounds Anne laid out in comment 5: accept="" isn't even remotely well implemented as it is, so adding yet more features to it already seems a bit dodgy. In practice I expect people will just pick the file regardless of the extension or type, anyway, so the feature is arguably not that useful in the first place.
mass-move component to LC1
Having implemented support for the accept attribute in IE10, we are now getting feedback from web developers that they would like to be able to supply file extensions in the accept attribute. We believe this would be a useful addition to the accept attribute because in many operating systems including Windows, files can
Another problem is that file extensions are not platform-independent. In particular, the web platform does not really have knowledge of them (except in one or two places for legacy reasons).
Also, comment 5 is not addressed yet.
(In reply to comment #9)
> Another problem is that file extensions are not platform-independent.
Now that Mac OS Classic is long gone, file name extensions are used on pretty much all platforms that expose files to the user. Also, collision avoidance in this space has worked remarkably well without IANA formalities.
> In particular, the web platform does not really have knowledge of them
That could be treated as a bug.
In reply to comment #5
(In reply to comment #11)
> Here is one specific example. A web developer who wants the user to upload a
> .csv file. The mime type for csv files is “application/vnd.ms-excel”.
Actually it's text/csv. See: http://tools.ietf.org/html/rfc4180
(In reply to comment #12)
text/csv isn't a registered mime type on either of my two machines. Putting accept="text/csv" results in a *.* filter in both IE and Chrome. It's only a single data point but I think it still shows that using a mime type in the accept parameter is inexact. Browsers can attempt to look up the file extension the developer wanted, but it won't work in all cases.
If the developer could specify exactly what file extensions they wanted, we could eliminate the inexactness of the mime lookup.
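A rough sketch of that lookup, using Python's standard `mimetypes` module as a stand-in for whatever registry a browser consults. The function name `dialog_patterns` and the fallback to `*.*` are assumptions made to mirror the behaviour described above, not the actual IE or Chrome logic. The result for text/csv depends on the machine's type registrations, so no output is shown for it.

```python
import mimetypes


def dialog_patterns(mime_type):
    """Return file-dialog patterns for a MIME type, or ["*.*"] if unmapped."""
    # The mapping comes from the local type tables, so it differs by machine.
    extensions = mimetypes.guess_all_extensions(mime_type)
    if not extensions:
        # Nothing registered for this type: fall back to showing every file.
        return ["*.*"]
    return sorted("*" + ext for ext in extensions)


print(dialog_patterns("text/csv"))
# A made-up type that no system should have registered.
print(dialog_patterns("application/x-hypothetical-unregistered"))  # ['*.*']
```

An extension token would skip the lookup step entirely, which is the inexactness the comment wants to remove.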
File extensions aren't any more exact than MIME types, IMHO. What's the format of a .doc file? There's dozens of software packages that use that extension. Even Word uses that same extension for multiple different formats.
But I've never objected to this proposal per se; the reason this bug is not yet fixed is described in comment 5 and comment 6. The feature hasn't been widely implemented in the first place, so it doesn't make sense to extend it already, when it's not clear it's even a valid feature (we don't have significant implementation experience with the MIME type version of this). CSV files are in fact a good example, regardless of whether the author uses text/csv or .csv: if a user goes to a Web site and downloads a CSV file, it has as good a chance of having the .txt or even .html extension, or indeed .php or .cgi, as it does .csv. Why would it help the user to set a type or an extension?
(In reply to comment #14)
> File extensions aren't any more exact than MIME types, IMHO. What's the format
> of a .doc file? There's dozens of software packages that use that extension.
> Even Word uses that same extension for multiple different formats.
Even if you can find an example of non-Word files being named .doc, it doesn't make MIME types and file name extensions equally (in)exact. Chances are that for upload, a filter based on file name extension would still have a better success rate than MIME types, because the current systems in use use file name extensions, so a browser-provided always-incomplete mapping layer isn't needed.
> Awaiting implementation experience of existing accept="" feature
Why doesn't comment 11 count as sufficient experience? Why should anyone implement the feature as specced if they already see the specced feature doesn't do what's actually needed?
Henri: Based on comment 11, I would be forced to conclude that providing a MIME type at all is a waste of time, and that we should remove that feature and replace it with one that takes extensions, to see if that works instead. That seems highly unlikely to be the actual situation (I would guess that since operating systems these days actually have applications register MIME types and extensions alike, they would often both work, especially for more standard types like images), hence my desire to get more implementation experience.
My guess is that data will show that both MIME types and extensions have their place, but both should be purely advisory, never _preventing_ users from uploading their files.
I am nominating this bug for escalation. The proposal is pretty small and fits well with the intention of the accept attribute, also the main objection seems to be lack of implementation experience, something that I believe is pretty debatable.
Proposed title: Accept attribute should allow file extensions in addition to the current allowed values
Description: The spec states: “The accept attribute may be specified to provide user agents with a hint of what file types will be accepted.” The options currently allowed in the accept attribute are mime types and the special tokens “audio/*”, “image/*”, “video/*”. Allowing file extensions to be a token would help many developers and fits in exactly with the purpose of the accept attribute.
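As a sketch of how the proposed token kinds could be told apart (following the bug title's suggestion of keying on a leading period), something like the following would do. This is an illustration only, not the spec's parsing algorithm: the function name `classify_accept_tokens`, the comma splitting, and the simplistic MIME check are all assumptions.

```python
# The three special tokens named in the description above.
WILDCARDS = {"audio/*", "image/*", "video/*"}


def classify_accept_tokens(accept):
    """Split an accept value on commas and label each token."""
    result = []
    for raw in accept.split(","):
        token = raw.strip().lower()
        if not token:
            continue  # skip empty entries such as a trailing comma
        if token.startswith(".") and len(token) > 1:
            kind = "extension"
        elif token in WILDCARDS:
            kind = "wildcard"
        elif token.count("/") == 1 and all(token.split("/")) and "*" not in token:
            # Very rough "type/subtype" check, not full MIME validation.
            kind = "mime"
        else:
            kind = "invalid"
        result.append((token, kind))
    return result


print(classify_accept_tokens(".csv, text/csv, image/*, bogus"))
# [('.csv', 'extension'), ('text/csv', 'mime'), ('image/*', 'wildcard'), ('bogus', 'invalid')]
```

Because MIME types never start with a period, the two kinds of token cannot be confused with each other.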
(In reply to comment #18)
> I am nominating this bug for escalation. The proposal is pretty small and fits
> well with the intention of the accept attribute, also the main objection seems
> to be lack of implementation experience, something that I believe is pretty
This bug is in state "NEW"; thus, it's not supposed to be escalated to a tracker issue. Asking chairs for advice how to proceed...
(In reply to comment #19)
> (In reply to comment #18)
> > I am nominating this bug for escalation. The proposal is pretty small and fits
> > well with the intention of the accept attribute, also the main objection seems
> > to be lack of implementation experience, something that I believe is pretty
> > debatable.
> > ...
> This bug is in state "NEW"; thus, it's not supposed to be escalated to a
> tracker issue. Asking chairs for advice how to proceed...
In particular, look at the entries for Aug 3, 2011 and December 31, 2011, and specifically, the consequences for missing that latter date.
Also look at the entry for Jan 14, 2012.
(In reply to comment #20)
> Julian: see:
> In particular, look at the entries for Aug 3, 2011 and December 31, 2011, and
> specifically, the consequences for missing that latter date.
Dec 31: "Consequences of missing this date: bugs still open past this date can be escalated to Tracker Issues immediately if the originator so chooses."
So I read this as saying that this issue can be escalated even though it's in OPEN state.
> Also look at the entry for Jan 14, 2012.
Aware of that :-).
Raised as <https://www.w3.org/html/wg/tracker/issues/197>.
Changed status per: http://lists.w3.org/Archives/Public/public-html/2012Jan/0087.html
(In reply to comment #11)
> In reply to comment #5
Original content of comment #11:
Comment 11 got lost somehow. No idea how, but I have asked the systems team to check on it.
Here for the record is the original comment:
--- Comment #11 from Sharon [MSFT] <firstname.lastname@example.org> 2011-10-14 22:51:38 UTC ---
In reply to comment #5 – I agree that there are some interesting use cases for
image/*, video/* and audio/* strings. I’m not proposing removing those,
however, we believe there are also good use cases for file extensions which are
not covered by the existing options.
Here is one specific example. A web developer who wants the user to upload a
.csv file. The mime type for csv files is “application/vnd.ms-excel”. If you
use this mime type in Chrome today you get only .xls file and in IE10 today you
get .csv, .slk, .xla, .xld, .xlk, .xll, .xlm, .xls, .xlt and .xlw. Neither of
these are really what the developer wanted.
In reply to comment #9 – Section 12 of the spec already has some mentions of
file extensions. Also the purpose of the accept attribute is to give the
browser hints for how to request files from the OS. This seems like a logical
place to use an OS concept even if it’s not known to the web platform.
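The CSV example in this comment can be restated as a small comparison. The extension sets below are copied from the comment as reported for accept="application/vnd.ms-excel"; the names `REPORTED_FILTERS`, `WANTED`, and `compare` are made up for this sketch.

```python
# Extension sets reported in the comment for accept="application/vnd.ms-excel".
REPORTED_FILTERS = {
    "Chrome": {".xls"},
    "IE10": {".csv", ".slk", ".xla", ".xld", ".xlk",
             ".xll", ".xlm", ".xls", ".xlt", ".xlw"},
}

# What the developer in the example actually wanted.
WANTED = {".csv"}


def compare(wanted, offered):
    """Return (missing, extra) extensions relative to what the author wanted."""
    return sorted(wanted - offered), sorted(offered - wanted)


for browser, offered in REPORTED_FILTERS.items():
    missing, extra = compare(WANTED, offered)
    print(browser, "missing:", missing, "extra:", extra)

# An extension token would describe the wanted set directly.
print(compare(WANTED, {".csv"}))  # ([], [])
```

In the reported Chrome case the wanted extension is missing altogether, and in the reported IE10 case it is present but buried among nine others.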
Done, though I used slightly different text since the text in the CP didn't make much sense (e.g. it introduced a conformance criterion that depended on the concept of a "valid file extension", without saying what that was; required user agents to treat file extensions "as file extensions", without saying what that meant; and required user agents to treat an invalid string "as a MIME type", without saying how to do so), and added a few notes and examples for clarity.
Checked in as WHATWG revision r7057.
Check-in comment: Introduce extensions in accept=''.