This specification defines an HTTP client interface for XPath-based languages. The interface is provided through a single extension function that performs HTTP requests, together with associated error codes that define client error states.
It has been designed to be compatible, via [[!XPATH20]], with [[!XQUERY]] and [[!XSLT20]]. It should also be suitable for any other language which hosts XPath 2.0, such as [[!XPROC]].
The module defined by this document defines one function in the namespace
http://expath.org/ns/http-client. In this document, the
http prefix, when used, is bound to this namespace URI.
Error codes are defined in the namespace
http://expath.org/ns/error. In this document, the
err prefix, when used, is bound to this namespace URI.
Error conditions are identified by a code (a
QName). When such an error
condition is reached during the execution of the function, a dynamic error is thrown, with
the corresponding error code (as if the standard XPath function
fn:error had been called).
There are many cases where the HTTP protocol layer may raise an error. In each case, if the error condition is not mentioned explicitly in the spec, the implementation MUST raise an error with the error code err:HC001.
This module defines an XPath extension function that sends an HTTP request and returns the corresponding response. It also supports HTTP multi-part messages. Here is the signature of this function:
$request contains the various parameters of the request, for instance the HTTP method to use or the HTTP headers. Among other things, it can also contain the values of the other parameters: the URI and the bodies. If they are not set as parameters to the function, their value in $request, if any, is used instead. The http:request element is defined later in this document. If the parameter does not follow the grammar defined in this spec, the error err:HC005 MUST be raised.
$href is the HTTP or HTTPS URI to send the request to. It is an xs:anyURI, but is declared as an xs:string so that literal strings may be used; in other words, the parameter does not need to be explicitly cast as xs:anyURI.
$bodies is the request body content, for HTTP methods that can contain a body in the request (e.g. POST). It is an error if this parameter is not the empty sequence for methods that must be empty (e.g. DELETE). The details of the methods are defined in their respective specifications (e.g. [[!rfc2616]] or [[!rfc4918]]). In the case of a multipart request, it can be a sequence of several items, each of which is the body of the corresponding body descriptor in $request: see http:multipart.
Besides the arity-three signature above, there are two other signatures that are convenient shortcuts (corresponding to the full version in which corresponding parameters have been set to the empty sequence). They are:
The functions defined in this module allow the transmission of a request to an HTTP server and the reception of the corresponding response. The request is represented by the parameters to the function, which define how to generate the actual HTTP request to transmit.
The http:request element represents all the information needed to send the HTTP request.
Some of the values defined for the http:request element can instead be set through a parameter to the function. For instance, some signatures define the parameter $href. If the value of this parameter is not the empty sequence, it will override the value of the attribute href.
<http:request method = ncname
              href? = uri
              http-version? = string
              status-only? = boolean
              username? = string
              password? = string
              auth-method? = string
              send-authorization? = boolean
              override-media-type? = string
              follow-redirect? = boolean
              timeout? = integer>
   <!-- Content: (http:header*, (http:body|http:multipart)?) -->
</http:request>
method is the HTTP method to use, e.g. POST. It is case-insensitive.
href is the URI that the request is made to. It can be overridden by the parameter $href.
http-version is the version of HTTP to use. It must be either the string 1.0 or 1.1. The default is implementation-defined. An implementation SHOULD support both, and the default SHOULD be 1.1. If the value specified is not supported by a specific implementation, it MUST throw the error err:HC007.
status-only controls how the response will be parsed; if it is true, only the status code and the headers are returned, and the content is omitted (no http:body, nor http:multipart, nor the interpreted additional value in the returned sequence).
username, password, auth-method and send-authorization are used for authentication (see the description of authentication later in this document).
override-media-type is a Media Type ([[rfc6838]]). It can be used only with http:request, and will override the Content-Type header in the HTTP response returned by the server.
follow-redirect controls whether an HTTP redirect is automatically followed or not. If it is false, the HTTP redirect is returned as the response. If it is true (the default), the function tries to follow the redirect by sending the same request to the new address (including body, headers, and authentication credentials). At most one redirect is followed (there is no attempt to follow a redirect received in response to following a first redirect).
timeout is the maximum number of seconds to wait for the server to respond. If this duration is exceeded, the error err:HC006 MUST be raised.
http:header represents an HTTP header, either in the http:request or in the http:response elements.
http:multipart represents a multi-part body, either in a request or a response.
http:body represents the body, either of a request or a response.
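The follow-redirect rule described above (follow at most one redirect, and return a second redirect as the response) can be illustrated with a small, non-normative Python sketch. The names `Response`, `send_with_redirect`, `fake_transport` and `REDIRECT_STATUSES` are hypothetical and not part of this module; the set of status codes treated as redirects is an assumption, since the text does not enumerate them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Assumption: these status codes count as redirects; the spec text does not list them.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


@dataclass
class Response:
    status: int
    headers: Dict[str, str] = field(default_factory=dict)


def send_with_redirect(href: str,
                       transport: Callable[[str], Response],
                       follow_redirect: bool = True) -> Response:
    """Send a request, following at most one redirect.

    `transport` is a hypothetical stand-in for the HTTP layer: it sends the
    (otherwise unchanged) request to the given address and returns the response.
    """
    response = transport(href)
    if not follow_redirect:
        # follow-redirect = false: the redirect itself is the response.
        return response
    location = response.headers.get("Location")
    if response.status in REDIRECT_STATUSES and location:
        # Same request, new address. Whatever comes back is returned as is,
        # even if it is another redirect.
        response = transport(location)
    return response


def fake_transport(href: str) -> Response:
    """Toy server: /a redirects to /b, /b redirects to /c, anything else is 200."""
    chain = {"/a": "/b", "/b": "/c"}
    if href in chain:
        return Response(302, {"Location": chain[href]})
    return Response(200)
```

With the toy server, a request to /a with follow-redirect enabled ends on the redirect issued by /b, because only the first redirect is followed.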
The http:header element represents an HTTP header, either in a request or in a response.
<http:header name = string
             value = string>
   <!-- Content: empty -->
</http:header>
The http:body element represents the body of either an HTTP request or an HTTP response (in multipart requests and responses, it represents the body of a single part).
<http:body media-type = string
           src? = uri
           method? = "xml" | "html" | "xhtml" | "text" | "binary" | qname-but-not-ncname
           byte-order-mark? = "yes" | "no"
           cdata-section-elements? = qnames
           doctype-public? = string
           doctype-system? = string
           encoding? = string
           escape-uri-attributes? = "yes" | "no"
           indent? = "yes" | "no"
           normalization-form? = "NFC" | "NFD" | "NFKC" | "NFKD" | "fully-normalized" | "none" | nmtoken
           omit-xml-declaration? = "yes" | "no"
           standalone? = "yes" | "no" | "omit"
           suppress-indentation? = qnames
           undeclare-prefixes? = "yes" | "no"
           version? = nmtoken>
   <!-- Content: any* -->
</http:body>
media-type is the media type of the body part. It is mandatory. In a request, it is provided by the user and is the default value of the Content-Type header if that header is not set explicitly. In a response, it is provided by the implementation from the Content-Type header returned by the server. The src attribute can be used in a request to set the body content as the content of the linked resource instead of using the children of the http:body element. When this attribute is used, the media-type attribute must also be present, and there can be neither content in the http:body element nor any other attribute; otherwise the error err:HC004 MUST be raised.
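As a non-normative illustration of the src constraint, the following Python sketch checks an http:body element parsed with the standard library. `HttpClientError` and `check_body_src` are hypothetical names, and treating whitespace-only text as "no content" is an assumption that the text does not settle.

```python
import xml.etree.ElementTree as ET


class HttpClientError(Exception):
    """Hypothetical carrier for the error codes defined in this document."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def check_body_src(body: ET.Element) -> None:
    """Raise err:HC004 if src is combined with content or other attributes."""
    if "src" not in body.attrib:
        return
    # Only media-type may accompany src, and it must be present.
    extra = set(body.attrib) - {"src", "media-type"}
    # Assumption: whitespace-only text does not count as content.
    has_content = len(body) > 0 or bool((body.text or "").strip())
    if "media-type" not in body.attrib or extra or has_content:
        raise HttpClientError(
            "err:HC004",
            "src requires media-type and excludes content and other attributes")
```

For example, an element with media-type, src and indent="yes" is rejected, while one carrying only media-type and src passes.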
All the attributes, except src, are used to set the corresponding serialization parameters defined in [[!xslt-xquery-serialization]]. Those attributes can be provided by the user on a request to control the way a part body is serialized. In the response, the implementation can, but is not required to, provide some of them if it has the corresponding information (some of them do not make any sense in a response and will therefore never be supplied on the response element).
The http:multipart element represents an HTTP Multipart Type request or response.
<http:multipart media-type = string
                boundary? = string>
   <!-- (http:header*, http:body)+ -->
</http:multipart>
The media-type attribute is the media type of the whole request or response, and has to be a multipart media type (that is, its main type must be multipart). The boundary attribute is the boundary marker used to separate the several parts in the message (the value of the attribute is prefixed with "--" to form the actual boundary marker in the request; conversely, this prefix is removed from the boundary marker in the response to set the value of the attribute).
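The relationship between the boundary attribute and the marker on the wire can be sketched as follows (non-normative Python; both function names are hypothetical). The closing delimiter with a trailing "--" comes from the MIME multipart syntax (RFC 2046), not from the text above.

```python
from typing import Tuple


def multipart_delimiters(boundary: str) -> Tuple[str, str]:
    """Return (part delimiter, closing delimiter) for a boundary attribute value."""
    delimiter = "--" + boundary   # prefix added when building the request
    closing = delimiter + "--"    # closing delimiter, per RFC 2046
    return delimiter, closing


def boundary_from_delimiter(delimiter: str) -> str:
    """Recover the attribute value from a part delimiter seen in a response."""
    if not delimiter.startswith("--"):
        raise ValueError("a multipart delimiter must start with '--'")
    return delimiter[2:]
```

For a boundary attribute of `xyz`, parts are separated by `--xyz` and the message ends with `--xyz--`.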
If the request entity body has content (one body or several body parts), it can be specified by the http:multipart element, the http:body element, and/or the parameter $bodies. For each body, the content of the HTTP body is generated as follows.
Except when its attribute src is present, an http:body element can have several attributes representing serialization parameters, as defined in [[!xslt-xquery-serialization]]. This spec defines in addition the method
binary; in this case the body content must be either an
xs:hexBinary or an
xs:base64Binary item, and no other serialization parameter can be set
The default value of the serialization method depends on the media type:
xml if it is an XML media type,
html if it is an HTML media type,
xhtml if it is
text if it is a textual media type, and
binary for any other media type.
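A possible reading of this default, as a non-normative Python sketch. `default_method` and `XML_TYPES` are hypothetical names. Several details are assumptions rather than statements from this section: that the XML media types are the four RFC 3023 types plus anything ending in +xml, that text/html is the only HTML type, that the xhtml method applies to application/xhtml+xml, and that textual types are those starting with text/ (plus application/xml-dtd).

```python
# Assumption: XML media types as listed in RFC 3023, minus application/xml-dtd.
XML_TYPES = {
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
}


def default_method(media_type: str) -> str:
    """Pick the default serialization method for a body's media type."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mt = media_type.split(";", 1)[0].strip().lower()
    if mt == "application/xhtml+xml":      # assumption: the XHTML case
        return "xhtml"
    if mt == "text/html":                  # assumption: the HTML case
        return "html"
    if mt in XML_TYPES or mt.endswith("+xml"):
        return "xml"
    if mt.startswith("text/") or mt == "application/xml-dtd":
        return "text"
    return "binary"
```

The order of the tests matters: application/xhtml+xml and text/xml must be recognized before the generic +xml and text/ checks.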
When a body element has no content (i.e. no child nodes), its content is given by the parameter $bodies. In a single-part request, this parameter must have at most one item. If the body is empty, the parameter cannot be the empty sequence. In a multipart request, $bodies must have as many items as there are empty body elements. If there are three empty body elements, the content of the first of them is the first item in $bodies, and so on. The number of empty body elements must be equal to the number of items in $bodies.
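The pairing of empty body elements with the items of $bodies can be sketched in Python as below (non-normative). `fill_empty_bodies` is a hypothetical helper in which `None` stands for a body element without child nodes; the text above does not name an error code for a count mismatch, so the sketch raises a plain `ValueError`.

```python
from typing import Any, List, Optional, Sequence


def fill_empty_bodies(inline: Sequence[Optional[Any]],
                      bodies: Sequence[Any]) -> List[Any]:
    """Give each empty body element (None) the next item of $bodies, in order."""
    empty_count = sum(1 for content in inline if content is None)
    if empty_count != len(bodies):
        # Assumption: the error code for this case is not stated here.
        raise ValueError(
            f"{empty_count} empty body element(s) but {len(bodies)} item(s) in $bodies")
    supplied = iter(bodies)
    return [next(supplied) if content is None else content for content in inline]
```

For a three-part request whose first and third body elements are empty, `fill_empty_bodies([None, "<a/>", None], ["x", "y"])` assigns "x" to the first part and "y" to the third.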
HTTP authentication when sending a request is controlled by the attributes username, password, auth-method and send-authorization on the http:request element. If username has a value, auth-method must have a value too. And if any one of the three other attributes has been set, username must be set too.
auth-method can be either Basic or Digest, but other values can also be used, in an implementation-defined way. The handling of those attributes must be done in conformance with [[!rfc2617]].
If send-authorization is true (default value is false) and the authentication method supports generating the header Authorization without challenge, the request contains this header. The default behavior is to send a non-authenticated request and, only if the response is an authentication challenge, to then send the credentials in a second request.
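The consistency rules on the authentication attributes can be sketched as follows (non-normative Python over a plain dictionary of attribute values). Both function names are hypothetical. The sketch only enforces the two rules that are legible in the text: username requires auth-method, and any of the other authentication attributes requires username. No error code is named for these cases, so a `ValueError` is used.

```python
from typing import Mapping

OTHER_AUTH_ATTRIBUTES = {"password", "auth-method", "send-authorization"}


def check_auth_attributes(attrs: Mapping[str, str]) -> None:
    """Validate the authentication attributes of an http:request element."""
    has_username = "username" in attrs
    if has_username and "auth-method" not in attrs:
        raise ValueError("username is set, so auth-method must be set too")
    present = OTHER_AUTH_ATTRIBUTES & set(attrs)
    if present and not has_username:
        raise ValueError(f"{sorted(present)} set without username")


def sends_authorization_upfront(attrs: Mapping[str, str],
                                method_supports_unchallenged: bool) -> bool:
    """True if the first request should already carry an Authorization header."""
    # send-authorization is an xs:boolean; its default is false.
    wanted = attrs.get("send-authorization", "false").strip() in {"true", "1"}
    return wanted and method_supports_unchallenged
```

When `sends_authorization_upfront` returns false, the client would send the request without credentials first and only answer a challenge afterwards.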
After having sent the request to the HTTP server, the function waits for the response. The HTTP client parses the raw response and the function returns a representation of the response as a sequence. The sequence has an http:response element as the first item, which is followed by an additional item for each body or body part in the response.
<http:response status = integer
               message = string>
   <!-- Content: (http:header*, (http:body|http:multipart)?) -->
</http:response>
The http:response element is the first item in the sequence returned by the function.
The status attribute is the HTTP Status Code returned by the server, and message is the Reason Phrase coming with the Status-Line. The http:header elements are as defined for the request, but represent the response headers instead. The http:body and http:multipart elements are also as in the request, but http:body elements must be empty.
Instead of being inserted within the http:response element, the content of each body is returned as a single item in the returned sequence. The items appear (after the http:response element) in the same order as the http:body elements. For each body, this item is built from the HTTP response as follows.
If the status-only attribute has the value true (default is false), the returned sequence will only contain the http:response element (with the headers, but also the empty http:body or http:multipart elements, as if status-only were false), and the following items, representing the bodies' content, are not generated from the HTTP response.
For each body that has to be parsed, the following rules apply in order to build the corresponding XDM item. If the body media type is a text media type, the item is an xs:string containing the body content. If the media type is an XML media type, the content is parsed and the item is the resulting XDM document-node. If the media type is an HTML type, the content is parsed into a document-node. If this is a binary media type, the content is returned as an xs:base64Binary item. From the previous rules, a result item can then be either a document-node (from XML or HTML), an xs:string, or an xs:base64Binary.
When the type of a part is either XML or HTML, its body has to be parsed into a document node. If an error occurs whilst parsing the content, the error err:HC002 MUST be raised.
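A rough, non-normative Python analogue of these rules, for the text, XML and binary classes only (HTML is left out because it needs an HTML parser that the standard library does not provide in DOM form). `HttpClientError` and `build_item` are hypothetical names; the media type class is passed in as `kind` rather than computed, a Python `str` stands in for xs:string, a minidom `Document` for a document-node, and a base64 string for xs:base64Binary.

```python
import base64
from xml.dom import minidom
from xml.parsers.expat import ExpatError


class HttpClientError(Exception):
    """Hypothetical carrier for the error codes defined in this document."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def build_item(kind: str, raw: bytes, charset: str = "utf-8"):
    """Turn a raw response body into the item returned after http:response."""
    if kind == "text":
        return raw.decode(charset)
    if kind == "xml":
        try:
            return minidom.parseString(raw)
        except ExpatError as exc:
            # Parsing failure maps to err:HC002.
            raise HttpClientError("err:HC002", f"cannot parse body: {exc}") from exc
    if kind == "binary":
        return base64.b64encode(raw).decode("ascii")
    raise ValueError(f"unsupported media type class in this sketch: {kind!r}")
```

A malformed XML body such as `b"<a><b></a>"` surfaces as err:HC002 instead of a parser-specific exception.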
If the attribute
override-media-type is set on the request, its value is
used instead of the
Content-Type header returned by the HTTP server. If the
Content-Type header of the
response indicates a multipart type, the value of
override-media-type can only be a
multipart type, or
application/octet-stream (to get the raw entity as a
binary item). If it is not, the error err:HC003 MUST be raised.
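The override-media-type rule can be sketched as follows (non-normative Python; `HttpClientError` and `effective_media_type` are hypothetical names). The function returns the media type that drives parsing of the response, and rejects an override that would reinterpret a multipart response as a non-multipart, non-binary type.

```python
from typing import Optional


class HttpClientError(Exception):
    """Hypothetical carrier for the error codes defined in this document."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def _bare_type(media_type: str) -> str:
    """Strip parameters (e.g. boundary=...) and normalize case."""
    return media_type.split(";", 1)[0].strip().lower()


def effective_media_type(content_type: str, override: Optional[str] = None) -> str:
    """Return the media type used to interpret the response entity."""
    if override is None:
        return content_type
    if _bare_type(content_type).startswith("multipart/"):
        target = _bare_type(override)
        allowed = target.startswith("multipart/") or target == "application/octet-stream"
        if not allowed:
            raise HttpClientError(
                "err:HC003",
                "a multipart response can only be overridden by a multipart "
                "type or application/octet-stream")
    return override
```

Overriding `multipart/mixed; boundary=xyz` with `application/octet-stream` is accepted and yields the raw entity as a binary item, whereas overriding it with `text/plain` raises err:HC003.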
In both requests and responses, Media Type strings are used to choose the way the entity content has to be serialized or parsed.
We define four different classes of Media Type, which are used for sending requests and receiving responses. The intent is to provide guidance on handling the entity content with respect to its content type, but an implementation is permitted to deviate from those rules if it is obvious that a particular type should be treated in a specific way; this can typically be useful for binary types such as [[EXI]].
application/xml-external-parsed-entity, as defined in [[!rfc3023]] (except that application/xml-dtd is considered a text media type). Media types ending with +xml are also considered XML types.
override-media-type was not a multipart media type or application/octet-stream.
src attribute on the body element is mutually exclusive with all other attributes (except media-type).