6864 – MIME type for built-in datatypes

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Summary: MIME type for built-in datatypes

Status:	CLOSED WONTFIX

Alias:	None

Product:	XML Schema
Classification:	Unclassified
Component:	Datatypes: XSD Part 2
Version:	unspecified
Hardware:	PC Linux

Importance:	P2 enhancement
Target Milestone:	---
Assignee:	David Ezell
QA Contact:	XML Schema comments list

URL:	http://soundadvice.id.au/blog/2009/05...

Reported:	2009-05-06 03:35 UTC by Benjamin Carlyle
Modified:	2010-11-10 17:04 UTC
CC List:	2 users

Description Benjamin Carlyle 2009-05-06 03:35:02 UTC

I have been using XSD built-in data types as representations for simple resources for quite some time. I have traditionally used the text/plain media type for this; however, doing so fails to adequately describe the format of typed data. A simple use case follows:

A client samples http://dod.example.com/defcon, retrieving "4". The defense readiness condition is at four. The client puts a value of "3", attempting to raise the condition... but this is either not implemented or requires authentication and authorization checks.

I am eager to be able to use a standard mime type that allows me to convey information in this way, and XML Schema is an excellent catalogue of these basic types. You already address namespace considerations in addressing these data types as URLs, so is there any possibility that you could register an additional MIME type for non-XML use of these types?

A number of alternative identifiers come to mind:
* text/xsd+plain
* text/xsd-builtin
* application/built-in
etc

It would greatly simplify my life if I could refer to a MIME type that is well established as being consistent with your built-in types.

Thank you.

Comment 1 Michael Kay 2009-05-06 09:29:25 UTC

Personal response: while I can see the logic, this seems to me to be stretching the concept of MIME type well beyond what it is capable of coping with. If we are to have MIME types for XSD-defined primitive types, then why not for user-defined types? And if we handle simple types, doesn't it make sense to handle complex types too? But this surely demands that types are identified by URIs rather than overloading the MIME registry. Apart from anything else, the processes used to register MIME types don't seem to be geared up to handle the kind of volume this would generate.

It seems eminently logical that there should be some kind of relationship between types as schema components and MIME types, but this proposal seems to nibble at the edges of what needs to be done to make that happen.

Comment 2 C. M. Sperberg-McQueen 2009-05-08 15:37:46 UTC

[Personal comment]

I'm not sure I understand what you're after.  You're returning "4", and not 
getting it typed.  Do you want a MIME type that means "an xsd integer"?
or one that means "a literal for a value of one of the XSD built-in types"?
(In which case, how do you tell which?  "0004" is a legal literal for integer,
int, short, decimal, and gYear.)
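
The ambiguity described above can be sketched in Python. This is only an illustration: the patterns are simplified approximations of the XSD lexical spaces for three of the types mentioned, and the names `LEXICAL_SPACES` and `candidate_types` are hypothetical, not part of any XSD tooling.

```python
import re

# Simplified approximations of three XSD lexical spaces (assumption:
# whitespace handling and value-range facets are ignored here).
LEXICAL_SPACES = {
    "integer": re.compile(r"[+-]?[0-9]+"),
    "decimal": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
    "gYear": re.compile(
        r"-?([1-9][0-9]{3,}|0[0-9]{3})"
        r"(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?"
    ),
}


def candidate_types(literal):
    """Return the names of every listed type whose pattern accepts the literal."""
    return sorted(
        name
        for name, pattern in LEXICAL_SPACES.items()
        if pattern.fullmatch(literal)
    )


print(candidate_types("0004"))
print(candidate_types("4"))
```

Under these simplified patterns, "0004" is accepted by all three types, while "4" is accepted only by integer and decimal, so the literal alone does not identify the type.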

Without deploying any new machinery, would it not be feasible to return 

  <readiness>4</readiness>

or even

  <readiness xsi:type="xsd:integer" xmlns:xsi="...">4</readiness>

?  All the machinery of WSDL and web services is then at your disposal
for defining the type.

Comment 3 Benjamin Carlyle 2009-05-12 12:12:20 UTC

My use case doesn't involve WSDL or WS-* technologies at this stage, which is why a mime type identifier is particularly useful. I work in the SCADA industry where a great deal of the information we have to deal with is atomic data, or atomic data with a few timestamps and other metadata attached. This information is often transferred or updated in soft real-time and generally has a safety-related aspect to it.
The main protocol I am using for interoperability between systems is HTTP, which relies on a correct mime type both for content negotiation purposes (I support a bare value, but will also accept a structured XML document with a number and corresponding timestamp) and for correct parsing. In short, it's a matter of being self-describing.
I have used text/plain as a proxy mime type for this purpose within my own organisation alongside an internal vcalendar mime type where a known set of metadata is used. I would like to have a more stable identifier available both as something to make use of within my organisation and to recommend to others who need to communicate very simple information such as numbers and strings, the kinds of information that the built-in types codify well.
I would be inclined to have a single mime type identifier that said "one of those xsd types". That is certainly the approach that we have taken for several years in my organisation. Which one of those types is not important to my application, and a bit of fuzz in this area is actually helpful to me. A server may only support the "short" range, but a client that requests the data expecting a long will parse the response value of "4" correctly. This allows a single client with a sufficiently large range for its purpose to interact with a range of different servers or resources with different ranges. The "text/plain" == "some built-in type" rule has worked out to be sufficiently self-describing for correct parsing and content negotiation but fuzzy enough to allow different versions of software or independently-developed software to exchange information effectively.
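
A minimal Python sketch of the client side of this arrangement, assuming the response body is a bare integer literal served as text/plain. The function name `parse_bare_long` and the 64-bit range are illustrative assumptions, and stripping whitespace is only an approximation of XSD whitespace collapsing.

```python
import re

# Lexical form shared by the XSD integer family (integer, long, int, short).
INTEGER_LITERAL = re.compile(r"[+-]?[0-9]+")

# Assumption: the client works with a signed 64-bit ("long") range.
LONG_MIN, LONG_MAX = -(2**63), 2**63 - 1


def parse_bare_long(body: str) -> int:
    """Parse a bare response body as an integer within the client's range."""
    text = body.strip()
    if not INTEGER_LITERAL.fullmatch(text):
        raise ValueError(f"not an integer literal: {text!r}")
    value = int(text)
    if not LONG_MIN <= value <= LONG_MAX:
        raise ValueError(f"outside the client's long range: {value}")
    return value


# A server limited to the "short" range might answer "4"; the client
# parses it the same way it would parse any larger value.
print(parse_bare_long("4"))
```

The client never needs to know which integer type the server uses internally, only that the literal fits the range the client can handle.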
The exact purpose of the information can vary also. A major internal use case for me is to get data onto an HMI shared between multiple services. This requires a common data type, and again much of this data is primitive in nature. Encoding "readiness" into the document, for example, would be something I would see as counterproductive in bringing the information to as wide an audience as possible. I really want everything from reports programs to spreadsheets to, obviously, other services to be able to pick this data up and make use of it without a heavy reliance on the notion that it is specifically a defence readiness condition. To most of these clients it is just a number they need to manipulate and output to a human or to another machine.
On the subject of making recommendations to others on how to deal with exchanging primitive data via HTTP: I have some small regard in the REST community, and am currently involved with writing a book for Prentice Hall - "SOA with REST". This has been a motivator for me to, I guess, push some of the internal best practices that have been developed within my organisation over the wall to a wider standards audience.
To me this is an important building block of machine interoperability, and attaching a mime type to it would be a useful step towards being able to exchange these basic types easily... especially directly over HTTP.

Comment 4 C. M. Sperberg-McQueen 2009-06-03 02:43:44 UTC

[Speaking for myself]

I've just spent some time reviewing this issue and the original poster's clear and helpful description of his usage of text/plain at

  http://soundadvice.id.au/blog/2009/05/05/#textplain

The end result is, I think, a combination of skepticism about the technical merits of the proposal and uncertainty about who would be the appropriate body to undertake the task if the task is to be undertaken.

The XML Schema WG is chartered (http://www.w3.org/XML/2009/02/schema-charter.html) to finish XSD 1.1 and to maintain XSD 1.0 and 1.1.  I think we do have an explicit goal of making XSD Datatypes usable not only in Structures but in other contexts (though I'm not going to cite a source for that belief unless someone argues otherwise), but I don't believe creating other specs to use XSD Datatypes is part of the WG's program of work.  That is, I think the proposal here is strictly speaking out of scope for the WG.  I am not sure what person or body has particular responsibility for the health of the text/* branch of the MIME type hierarchy, but I rather think that they, not the XML Schema WG, are the responsible party here.

I admit I might feel differently about the scope issue if I were more enamored of the proposal.  In that case, I might argue that defining a MIME type with the proposed semantics would fall under the heading of encouraging adoption of the XSD spec.  So I should probably outline my reasons for skepticism.

As comment 3 clarifies, what is proposed here is a MIME type with the meaning "the body of this message is a literal in some built-in XSD datatype or other".  This strikes me as unhelpfully vague and underspecified, whether the intended semantics are that the literal belongs to (a) an unspecified XSD datatype, whether built-in or user-defined, or (b) an unspecified built-in XSD datatype, or (c) one of the XSD primitive (or special?) datatypes.  And as the blog post pointed to above mentions, the limitation to XSD datatypes can be sub-optimal for some expected deployment scenarios, in which other non-XSD datatypes may also appear (e.g. geolocation information, or measurements for which the unit of measure is also to be specified). 

The alternative of a suite of MIME types for (let us say) the built-in primitive datatypes would provide more reliable information about the value actually being transmitted, but would I gather be more specific and constraining than the original poster wants.  (I also don't relish the task of getting approval for a suite of twenty new MIME types, but that may be just a personal preference to avoid doing any hard lifting.)

It's clear from the blog post cited above that part of the appeal of using text/plain is that text/plain offers extremely low barriers to entry and usage:  it's really easy to emit the data, and the data are really easy to use, as long as you know a priori whether it's going to be an integer (as decimal or as float or as double) or a gYear or a string or a relative URI reference.  Even a very simple XML encoding of the information as 

  <datum xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema"
         xsi:type="xsd:anySimpleType">1223</datum>

is significantly more complicated for the receiver to parse without an XML parser.  (Not really very hard, especially if you know the element is going to be called 'datum' and that the only attributes to expect are the namespace declarations and the type information, but still intimidating to someone who thinks text/plain might well work just fine.)  It seems clear that my earlier suggestion that Web Services might be brought to bear was calculated to horrify anyone who finds the MIME-type proposal attractive; sorry about that.
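
For comparison, a receiver that does have an XML parser available could read the element roughly as follows. This is a sketch using Python's standard `xml.etree.ElementTree`; the function name `read_datum` and the constant names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

MESSAGE = """<datum xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:xsd="http://www.w3.org/2001/XMLSchema"
       xsi:type="xsd:anySimpleType">1223</datum>"""


def read_datum(document: str) -> Tuple[str, Optional[str]]:
    """Return the literal and the raw xsi:type value (None if absent)."""
    root = ET.fromstring(document)
    literal = (root.text or "").strip()
    # ElementTree exposes namespaced attributes in {namespace}local form.
    declared_type = root.get(f"{{{XSI_NS}}}type")
    return literal, declared_type


print(read_datum(MESSAGE))
```

Note that the `xsi:type` value comes back as the raw string `xsd:anySimpleType`; resolving the `xsd` prefix to its namespace is left to the application in this sketch.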

But on reflection, I think the answer is:  the relative complexity of XML is there for a reason, and the XML example just given includes a lot more information than a message consisting only of 

  1223 

and it adjusts much more readily to change.  Someone wants to transmit a geolocation encoded using a type defined with DTLL?  Declare appropriate namespaces, provide appropriate pointers to something to tell you what the thing is.  If an application really doesn't need the ability to identify a specific datatype, because all it really needs is the literal, then the application seems necessarily to have some other channel of information to provide the necessary background information to enable "1223" to suffice as a message.  But in that case, I am not quite sure what a MIME type of the kind suggested here would provide by way of benefit:  it doesn't tell me anything about where the value came from, or even what kind of thing it is, with any precision (unless the literal is one of the rare literals which would be legal as a string but does not appear in any other lexical space).  If I have enough background information to know whether "1223" denotes a gYear, a relative URI, an integer, a string, or a time of day written using some non-ISO-8601 convention, or if my application really doesn't need to know any of that, then I'm not sure why I or my application need to know that the "1223" in the message is a literal in any of the XSD lexical spaces.  If they need to know that, then my strong gut feeling is that they probably need to know a lot more.

The upshot, I regret to say, is that I am not sold on this as an enhancement to XSD 1.1 or as an activity for the XML Schema WG.  Others may of course have different views.

Comment 5 David Ezell 2009-06-15 16:19:23 UTC

On the telcon of 2009-06-13 the WG discussed this issue, and decided that bug 6864 comment 4 provides the correct rationale for closing this bug without further action at this time.  Specifically, it's not clear that "typing" individual data within XML documents using MIME types is the correct fit, and providing any solution in this space prematurely would be counterproductive.

Therefore, the WG decided to close this bug as WONTFIX since at this point there is no clear way forward.

Comment 6 David Ezell 2010-11-10 17:01:06 UTC

The WG reported this bug as WONTFIX on 2009-06-15.  We are closing this bug as requiring no further work.  If there are issues remaining, you can reopen this bug and enter a comment to indicate the problem.  Thanks very much for the feedback.

Comment 8 David Ezell 2010-11-10 17:04:05 UTC

The WG reported this bug as WONTFIX on 2009-06-15.  We are closing this bug as
requiring no further work.  If there are issues remaining, you can reopen this
bug and enter a comment to indicate the problem.  Thanks very much for the
feedback.

Sorry for the duplicate comments here.