Re: TICKET 259: 'treat as invalid' not defined

Thanks for making a counter-proposal.  A few notes below.

On Fri, Dec 10, 2010 at 1:59 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
> On 12.11.2010 08:53, Julian Reschke wrote:
>> On 12.11.2010 05:58, Mark Nottingham wrote:
>>>
>>> I'm confused. I thought that we were going to talk about error
>>> handling in an appendix, but it appears you're starting to talk about
>>> it here.
>>
>> 1) Yes, it should be an appendix.
>>
>> 2) Well, it's parsing advice. It appears that some readers have trouble
>> understanding how to derive a parsing strategy from the way how we
>> currently write specs, so this is an attempt to describe just that.
>
> Here's an updated proposal (see also
> <http://trac.tools.ietf.org/wg/httpbis/trac/attachment/ticket/259/i259.diff>):
>
> -- snip --
>
> Appendix D.  Parsing
>
>   This document does not require any specific handling of invalid
>   header field values.  With this in mind, the text below describes a
>   simple strategy for parsing the header field and detecting problems
>   in general, or in specific parameters.
>
> D.1.  Combine Multiple Instances of Content-Disposition
>
>   If the HTTP message contains multiple instances of the Content-
>   Disposition header field, combine all field values into a single one
>   as specified in Section 4.2 of [RFC2616].
>
> D.2.  Parsing for Disposition Type and Parameters
>
>   Using the simplified grammar below:
>
>     field-value = disp-type *( ";" param )
>     disp-type   = token
>     param       = token "=" value
>
>   ...parse the field value into a disp-type (disposition type) and a
>   sequence of parameters (pairs of name (token) and value).  Lower-case
>   all disposition types and parameter names.
>
>   If the field value does not conform to the grammar (such as when not
>   exactly one disposition type is specified), ignore the whole header
>   field.

This doesn't cover cases like the following:

Content-Disposition: attachment; inline; filename=foo.exe

We want to treat those as an attachment.  Another grammar we could use
might be the following:

     field-value = item *( ";" item )
     item          = disp-type / param
     disp-type   = <OCTET, except ";" and "=">
     param       = param-name "=" param-value
     param-name = <OCTET, except "=">
     param-value = <OCTET, except ";">

We could then say that the first disp-type and the first param are the
ones that matter.  (I'm not sure this grammar handles <"> correctly,
but I'm sure we can sort that out.)
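
As an illustration only, the lenient grammar above could be sketched
roughly as follows in Python.  The function name
parse_content_disposition is hypothetical, and the whitespace trimming
and lower-casing are assumptions (the lower-casing is borrowed from the
quoted D.2 text).  The sketch splits naively on ";" and "=", so, like
the grammar itself, it makes no attempt to handle <"> correctly.

```python
def parse_content_disposition(field_value):
    """Split a field value into (disp_type, params) using the lenient grammar.

    disp_type is the first item without "=", or None if there is none.
    params is a list of (name, value) pairs in the order they appeared.
    """
    disp_type = None
    params = []
    for item in field_value.split(";"):
        item = item.strip()  # assumption: surrounding whitespace is ignored
        if not item:
            continue
        name, sep, value = item.partition("=")
        if sep:
            # An item containing "=" is a param.
            params.append((name.strip().lower(), value.strip()))
        elif disp_type is None:
            # Only the first disp-type matters; later ones are dropped.
            disp_type = item.lower()
    return disp_type, params
```

With this sketch, the example header above yields "attachment" as the
disposition type and a single "filename" parameter; the stray "inline"
is dropped.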

> D.3.  Checking Cardinality Constraints
>
>   If the parameter sequence contains multiple instances of the same
>   parameter name, ignore the whole header field.

We'd prefer to use the first one rather than ignore the header field.
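
A minimal sketch of that "first one wins" behavior, assuming the
parameters arrive as an ordered list of (name, value) pairs.  The
helper name first_wins is hypothetical, and matching names
case-insensitively is an assumption taken from the quoted D.2 text.

```python
def first_wins(pairs):
    """Collapse duplicate parameter names, keeping the first value seen."""
    params = {}
    for name, value in pairs:
        # setdefault only stores the value if the name is not already present,
        # so later duplicates are silently ignored.
        params.setdefault(name.lower(), value)
    return params
```

For example, given filename=a.txt followed by filename=b.txt, this
keeps a.txt instead of discarding the whole header field.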

> D.4.  Post-Process Parameter Values
>
>   For each parameter, post-process the associated value part according
>   to the grammar:
>
>   o  According to Section 3.2.1 of [RFC5987] for parameters using the
>      RFC 5987 syntax (such as "filename*").  If this fails, just ignore
>      this parameter.
>
>   o  According to the grammar for quoted-string (Section 2.2 of
>      [RFC2616]) for values starting with a double quote character (").

Does this imply \-decoding?  We don't want to do \-decoding.

>   o  Verbatim otherwise.

We'd like to do %-decoding both for the quoted and unquoted cases.

>   Note that this step starts with an octet sequence obtained from the
>   HTTP message, and results in a sequence of Unicode characters.

Somewhere we want to say what character set we're using.
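
To make the post-processing preferences above concrete, here is a
rough Python sketch: it strips a surrounding pair of double quotes,
does no \-decoding, and %-decodes in both the quoted and unquoted
cases.  The function name decode_param_value is hypothetical, and the
UTF-8 default is purely an assumption, since the character set is
exactly the open question here.

```python
from urllib.parse import unquote_to_bytes


def decode_param_value(raw, charset="utf-8"):
    """Post-process a parameter value: unquote, then %-decode.

    charset defaults to UTF-8 only as a placeholder assumption.
    """
    # Remove one surrounding pair of double quotes, if present.
    # Backslashes inside the value are deliberately left untouched.
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        raw = raw[1:-1]
    # %-decode to octets, then interpret them in the assumed charset.
    return unquote_to_bytes(raw).decode(charset, errors="replace")
```

Under these assumptions, both "foo%20bar.txt" in quotes and the bare
token foo%20bar.txt come out as the same file name, and a backslash in
a quoted value is kept verbatim.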

> D.5.  Extracting the Disposition Type
>
>   The parsing step (Appendix D.2) has returned the disposition type (to
>   be matched case-insensitively), which can be "attachment", "inline",
>   or an extension type.  If the type is unknown, treat it like
>   "attachment" (see Section 3.2).

What if there's no disposition type?

Content-Disposition: filename=foo.exe
Content-Disposition: foo=bar

If I remember correctly, we're supposed to treat the former as inline
and the latter as attachment.

> D.6.  Determining the File Name
>
>   The parsing and post-processing steps resulted in a set of parameters
>   (name/value pairs).  The suggested file name is the value of the
>   "filename*" parameter (when present), otherwise the value of the
>   "filename" parameter.
>
>   If neither is given, the UA can determine a name based on the
>   associated URI; for instance based on the last path segment.
>
>   Otherwise, the UA ought to post-process the suggested filename
>   according following Section 3.3. [[anchor10: We could say here that
>   UAs may reject filenames for security reasons, such as those with a
>   path separator character.]]

I'll update the wiki shortly to respond to your previous feedback and
to add the information from this message.

Thanks,
Adam

Received on Saturday, 11 December 2010 19:43:53 UTC