Observations from our initial https redirection tests

We recently announced that we are planning to start redirecting all of www.w3.org to https, as is commonly done elsewhere.

To get an understanding of whether this change is feasible for our site and what issues might arise, we conducted some limited tests of this configuration change, on August 1 and August 18-19.

Here are some notes on what we have learned so far, and answers to some questions we have received.

We receive a lot of automated requests for machine-readable resources on our site, and have for many years; see for example this blog post from 2008. Because of the huge amount of traffic (hundreds of millions of requests per day) and the generic user-agent headers in common use (for example Java/xx), it’s hard to identify the source of most of this traffic. The generic user-agent strings also make it difficult to do targeted outreach to the developers of the software making these requests.

Therefore we decided to do some limited tests of redirecting our entire site to https so any issues could be discovered and understood. We weren’t sure if this would only impact a handful of people who could easily adapt with some simple configuration changes, or if it had the possibility of being more disruptive.

During our initial tests we heard from a few people that this was causing issues with their systems that make automated requests to our site, for example when doing XML Schema validation. We are hoping these systems can be reworked to either follow the redirects to https, or use an XML catalog to keep local copies of any files needed to avoid making unnecessary requests to our site.

Questions we have received include: what action are we expecting from Web developers? Is it necessary to update all references starting with http://www.w3.org/ to https?

In general that is not necessary, and in fact in many cases those references to our site starting with http://www.w3.org/ must be preserved exactly as is, for example in a reference to an XML namespace that must be an exact match for a given string.
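As a rough illustration of why such references must stay unchanged, here is a minimal sketch using Python's standard xml.etree.ElementTree module (the sample document and the helper name count_elements are made up for this example). A namespace name is compared as a plain string and is never retrieved, so swapping its scheme to https simply stops it from matching:

```python
import xml.etree.ElementTree as ET

# The XML Schema namespace name. It is an identifier, not a download location.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# A tiny made-up schema document used only for this illustration.
DOC = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
    '<xs:element name="a"/>'
    '</xs:schema>'
)


def count_elements(root, namespace):
    """Count direct <element> children of root in the given namespace."""
    return len(root.findall("{%s}element" % namespace))


root = ET.fromstring(DOC)

# Matches, because the string is identical to the one in the document.
http_matches = count_elements(root, XSD_NS)

# Does not match: "https://..." is a different string, so a different namespace.
https_matches = count_elements(root, XSD_NS.replace("http://", "https://", 1))

print(http_matches, https_matches)
```

No network request is made at any point here; the redirect only matters for software that actually retrieves resources from our site.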

If you maintain a software system that retrieves resources from www.w3.org, please check whether it can handle redirects and https, and update the software if needed. Also consider carefully whether you want to keep this dependency on our site or whether it would be worthwhile to rework your systems to remove it, for example by using an XML catalog. We do our best to keep our systems available and performant, but like any other site we have occasional service outages. We expect most people would not want their production systems to be impacted by issues with our site.
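The idea of keeping local copies could look something like the following minimal sketch. It is not a real XML catalog (that is a standardized file format understood by many XML parsers), only a plain lookup table that plays the same role. The names LOCAL_COPIES, load_resource and fetch_over_network, and the path schemas/xmlmime.xsd, are hypothetical and chosen for this example:

```python
from pathlib import Path
from urllib.request import urlopen

# Hypothetical mapping from the URLs your system references to local copies.
# The URL keys stay exactly as they appear in your documents (http, not https).
LOCAL_COPIES = {
    "http://www.w3.org/2005/05/xmlmime": "schemas/xmlmime.xsd",  # assumed path
}


def fetch_over_network(url):
    """Fallback: retrieve the resource over the network.

    urlopen follows HTTP redirects by default, including http to https.
    """
    with urlopen(url, timeout=10) as response:
        return response.read()


def load_resource(url, catalog=LOCAL_COPIES, fetch=fetch_over_network):
    """Return the bytes for url, preferring a local copy over the network."""
    local_path = catalog.get(url)
    if local_path is not None and Path(local_path).is_file():
        # Local copy found: no request to www.w3.org is needed.
        return Path(local_path).read_bytes()
    # No usable local copy: fall back to a network request.
    return fetch(url)
```

With an arrangement like this, an outage or configuration change on our side only affects resources that have no local copy.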

We plan to continue limited tests of this change to our site over the coming weeks and months, gathering more feedback so that we understand its impact before deploying it more permanently. Depending on the results of these tests we may decide to defer this change until more software can be updated, or deploy it with specific exceptions, for example continuing to serve .xsd files via HTTP while redirecting the rest of the site.

To stay informed of future tests and other updates to our systems please stay tuned to our systems status page.

10 thoughts on “Observations from our initial https redirection tests”

  1. Hi, we are using PHP SoapClient for the provisioning service of our Metaswitch system.

    ($eas_client = new SoapClient('easwsdl/ShService.wsdl');)

    Currently the W3C redirect stops our provisioning service for the Metaswitch web service. Could you please stop the extended test, which is scheduled to end on Sept 3 at 17:00 UTC, until we can solve this problem?

    “SOAP-ERROR: Parsing Schema: can't import schema from 'http://www.w3.org/2005/05/xmlmime'”

    1. Hi, which version of PHP are you using? I did a local test of SoapClient in PHP 7.4 and it appears to be able to handle redirects and https.

      I found a copy of ShService.wsdl online and each of the namespace and schema references listed there appears to be redirecting to https:

      <wsdl:definitions targetNamespace=”http://www.metaswitch.com/ems/soap/sh”

      <xs:schema elementFormDefault=”qualified”

      Have you researched how you might remove this dependency on our site, for example by caching the files locally or using an XML catalog?

  2. I am using Java 1.8.0_211 (build 1.8.0_211-b12).

    At the moment (11/09/2022) validation does not work.
    error: Caused by: org.xml.sax.SAXParseException; systemId: http://www.w3.org/2005/05/xmlmime; lineNumber: 1; columnNumber: 1; Premature end of file.

    Can you suggest how to fix this error?
    PS: it’s old legacy code

  3. Currently the Maven plugin “jaxws-maven-plugin” produces schemaLocation="http://www.w3.org/2005/05/xmlmime" in wsgen, but the wsimport goal fails. It seems the correct address is schemaLocation="https://www.w3.org/2005/05/xmlmime". How can http be changed to https for the MIME schema at build time? wsgen generates WSDL and XML with the MIME reference http://www.w3.org/2005/05/xmlmime, whereas https://www.w3.org/2005/05/xmlmime would now be needed. Starting from a Java class annotated with @WebServices, the WSDL and XML are produced with a MIME schemaLocation of http instead of https.

      1. Hello Ryan, I see that you are eager to find a solution to this problem as well.

        I’ve just come across it, and since it’s been a few weeks since you commented, I was hoping that you had managed to solve this problem. Have you found a solution?

  4. +1

    We are facing the same issue, specifically with https://www.w3.org/2005/05/xmlmime.

    Our webapp requests a third-party WSDL which includes an <xsd:import… which in turn includes "”. The Java code then throws the exception “org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.xerces.parsers.DOMParser.parse(DOMParser.java:245)
    at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:298)”

