Redirecting to https on all of www.w3.org
W3C's main web site www.w3.org has been available via https for over a decade, but until now we have not been redirecting all requests to https, as is commonly done on most other sites.
The primary reason for this is that we wanted to avoid causing issues for software requesting machine-readable resources from www.w3.org, such as HTML DTDs, XML Schemas, and namespace documents.
We believe enough time has passed for most such software to have been updated to handle redirects and https, so we are planning to start redirecting all requests received over http to https within a month or two.
To discover any remaining issues, and to give advance notice to operators of software that still has trouble with redirects or https, we plan to conduct some limited tests before fully deploying this change to our site. During each test we will redirect all http traffic to https for a few hours at a time.
The first such test is planned for Monday August 1, for 8 hours starting at 14:00 UTC (14:00 UTC to 22:00 UTC).
Update 16 Aug 2022: The second test is planned to run from Thursday August 18 to Sunday August 21, for 72 hours starting at 14:00 UTC on Thursday.
Update 19 Aug 2022: We ended the second test early, at 17:30 UTC today due to several complaints that this change was impacting production services. We plan to conduct another test in two weeks, for 48 hours starting at 17:00 UTC on Sep 1, ending at 17:00 UTC Sep 3. If you have dependencies on our web site in your production services please work to remove them, or update them to handle redirections and https.
If you have any questions or comments about this planned change, please post a comment here or contact us by email at sysreq@w3.org
The URLs http://www.w3.org/2001/xml.xsd and https://www.w3.org/2001/xml.xsd are now redirecting to an HTML page instead of the XSD file.
+1 to this, it is breaking xml schema validation in our codebase
The XSD file is still being returned:
$ curl -s -i https://www.w3.org/2001/xml.xsd
HTTP/2 200
last-modified: Wed, 21 Jan 2009 22:06:40 GMT
cache-control: max-age=7776000
expires: Thu, 27 Oct 2022 06:50:54 GMT
content-type: application/xml
<?xml version='1.0'?>
<?xml-stylesheet href="../2008/09/xsd.xsl" type="text/xsl"?>
<xs:schema targetNamespace="http://www.w3.org/XML/1998/namespace"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns ="http://www.w3.org/1999/xhtml"
xml:lang="en">
<xs:annotation>
<xs:documentation>
<div>
<h1>About the XML namespace</h1>
[...]
It might look like a web page due to the XSL stylesheet:
$ curl -s -i https://www.w3.org/2008/09/xsd.xsl
HTTP/2 200
date: Mon, 01 Aug 2022 18:39:04 GMT
last-modified: Fri, 04 May 2012 01:28:13 GMT
content-length: 31910
content-type: application/xslt+xml
<!DOCTYPE xsl:stylesheet [
<!--*
<!DOCTYPE xsl:stylesheet PUBLIC 'http://www.w3.org/1999/XSL/Transform'
'../../People/cmsmcq/lib/xslt10.dtd' [
*-->
[...]
xsd.xsl: format an XSD schema document for simple display in a Web browser.
It broke our validations too.
We had this error
Caused by: org.xml.sax.SAXParseException; systemId: http://www.w3.org/2001/xml.xsd; lineNumber: 1; columnNumber: 1; Premature end of file.
I replaced http://www.w3.org/2001/xml.xsd by https://www.w3.org/2001/xml.xsd and it is working now.
I thought the xsd would be returned like this URL
https://www.w3.org/2001/03/xml.xsd
Thanks!
I think the issue then is the https redirect itself, and changing the references to https causes its own issues. Getting the error "The namespace of element 'schema' must be from the schema namespace, 'http://www.w3.org/2001/XMLSchema'."
What software are you using to do the XML schema validation? Is it a recent version?
The javax.xml.validation package included with JDK 11
+1 our schema also broke with this change. We are using javax.xml.validation.SchemaFactory with XMLConstants.W3C_XML_SCHEMA_NS_URI, on Java 8.
Do we need to update all http references to https?
Can you please tell me when this will go back to HTTP?
Can we get this back and not wait until Sat?
Getting the same errors. We are using an old schema for XACML, which https://github.com/wso2/balana uses for policies. The XML/XSD part of that code hasn't been touched in a long time, so changing to https breaks our code.
Unfortunately, Apache xerces, a popular Java XML parser, doesn't seem to work with w3.org on https. We're scrambling to reenable XML parsing so our users can log in, but it seems like this core Java library is standing in the way.
Even if a new version of xerces is released that supports https://www.w3.org, it's going to be next to impossible to get that out to everyone who needs it.
Of course, I could be completely wrong.
https://github.com/apache/xerces2-j/blob/trunk/src/org/apache/xerces/impl/xs/SchemaSymbols.java#L39
Thanks for the report. It seems suboptimal for W3C's web site to be in the critical path to users being able to log in to your site. Is that something you can fix? If so, we would appreciate if you could share how you fixed it.
It didn't seem suboptimal until yesterday! ;)
In the short term, we just stopped validating our XML, which got us out of this jam but comes with other problems. We're discussing options, but it seems like we'll most likely move to referencing a local copy of the schema.
I agree that external dependencies are a liability. It _does seem_ that Apache xerces suggests this workflow by the way that code is designed. And, by extension, Java.
(thanks for the dialog, by the way)
Sorry, but libxml2 doesn't seem to be able to resolve HTTPS redirects. I happened to be testing for the first time in a long time some XML parsing, and I happened to do so on August 18! Just my luck! I think you'll need to work with library maintainers to fix this before implementing this permanently.
Unfortunately we don't have the resources to track down all the implementations that might be hitting our site and work with the maintainers to fix things, hence this staged rollout to give people some time to adapt.
I see this issue with libxml2 has already been reported https://gitlab.gnome.org/GNOME/libxml2/-/issues/160 (and I see you just commented there as well – thanks)
Hello,
We use libxml2 for XML DTD parsing/validation (via Perl), and we're experiencing this same problem. A couple users noticed the problem back in August, didn't report it, and "it went away", so they thought nothing of it.
Today, more users are reporting this problem, and I stumbled across this blog post. This is the error we're receiving:
I/O error : failed to load external entity "http://www.w3.org/2003/entities/iso8879/isolat1.ent"
%isolat1;
^
http error : Unknown IO error
We rely heavily on W3C websites. This recent change/testing is a huge problem for us. I see that LibXML2 likely won't be changed anytime soon. Can you please consider rolling this back from HTTPS to HTTP?
Thank you!
Thanks for the information and update.
We are getting build failures with the latest Maven jaxws-maven-plugin, which doesn't seem to handle the redirects. Pointing our wsdl definitions at https://www.w3.org/2001/XMLSchema.xsd still fails, because that xsd itself still references non-ssl schemas like http://www.w3.org/2001/xml.xsd.
We'll see if we can come up with work-arounds, but would it be possible to clean up the schema to refer to ssl locations only please? Many thanks.
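One possible workaround for a schema that still points at http locations is to keep a local copy and rewrite only its `schemaLocation` hints. The Python sketch below is a minimal illustration under stated assumptions: the function name `upgrade_schema_locations` is hypothetical, it works on the text of the file rather than on a parsed tree, and it only touches unprefixed `schemaLocation` attributes that point at www.w3.org. Namespace names such as http://www.w3.org/2001/XMLSchema are identifiers rather than download locations, so they are deliberately left alone.

```python
import re

# Matches an unprefixed schemaLocation attribute (as used on xs:import and
# xs:include) whose value starts with http://www.w3.org/.
# The lookbehind skips prefixed forms such as xsi:schemaLocation, whose value
# is a list of "namespace location" pairs and must not be rewritten blindly.
_HTTP_LOCATION = re.compile(
    r"""(?<![\w:])(schemaLocation\s*=\s*["'])http://www\.w3\.org/"""
)


def upgrade_schema_locations(xsd_text):
    """Return xsd_text with http://www.w3.org/ schemaLocation hints made https.

    Namespace names (xmlns declarations, namespace="..." attributes) are
    identifiers, not retrieval URLs, and are left unchanged.
    """
    return _HTTP_LOCATION.sub(r"\1https://www.w3.org/", xsd_text)
```

Applied to a local copy of a schema, this would turn `schemaLocation="http://www.w3.org/2001/xml.xsd"` into its https form while keeping `namespace="http://www.w3.org/XML/1998/namespace"` as it is. Whether a given tool then fetches the https location successfully still depends on that tool.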
We are building jaxb models for an old soap service as part of our build, and the xjc that comes with Java 11 isn't capable of following these redirects:
[ant:xjc] [ERROR] Premature end of file.
[ant:xjc] line 1 of http://www.w3.org/2001/xml.xsd
[ant:xjc]
FAILURE: Build failed with an exception.
Luckily we could update the reference to xml.xsd in our configuration, and now we don't include a file from "anyone" in our build but specifically one from you.
This is also breaking Windows drivers Static Driver Verifier builds.
https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/static-driver-verifier
```
[INFO] Validating XML against schema: C:\EWDK\Program Files\Windows Kits\10\TOOLS\SDV\smv\bin\Config.xsd
Unhandled Exception: System.Xml.Schema.XmlSchemaValidationException: The 'http://www.w3.org/XML/1998/namespace:base' attribute is not declared.
...
```
We ended the second test early, at 17:30 UTC today due to several complaints that this change was impacting production services. We plan to conduct another test in two weeks, for 48 hours starting at 17:00 UTC on Sep 1, ending at 17:00 UTC Sep 3. If you have dependencies on our web site in your production services please work to remove them, or update them to handle redirections and https.
This change is breaking our remote signature web client service to a third party service vendor due to Apache xerces problems. The Sep 1 test appears not to have finished on September 3 at 17:00 as indicated, but is still active. This is blocking functionality for many of our customers.
We extended the Sep 1-3 test due to lack of issues reported. https://status.w3.org/incidents/j1gjbnnd4rkk
Could you please provide more info about your system? Which URL(s) is it accessing on www.w3.org? Have you researched how you might remove this dependency on our site, for example by caching the files locally or using an XML catalog? You are welcome to send details to sysreq@w3.org if you would rather not post them here.
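As a rough illustration of the "cache the files locally" option mentioned above, the following Python sketch maps a resource URL to a file in a local mirror directory. It is a simplified stand-in and not an implementation of the XML catalog format; the class name `LocalMirror` and the directory layout (`<root>/<hostname>/<path>`) are assumptions made for this example. The scheme is ignored on purpose, so the http and https forms of a URL resolve to the same local copy.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit


class LocalMirror:
    """Resolve http(s) URLs to files under a local directory (hypothetical helper)."""

    def __init__(self, root):
        self.root = Path(root)

    def local_path(self, url):
        """Return the mirrored file for url, or None if it is not mirrored."""
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.hostname:
            return None
        # Refuse paths that try to climb out of the mirror directory.
        if ".." in PurePosixPath(parts.path).parts:
            return None
        # The scheme is not part of the lookup, so http:// and https://
        # forms of the same URL map to the same local file.
        candidate = self.root / parts.hostname / parts.path.lstrip("/")
        return candidate if candidate.is_file() else None

    def read(self, url):
        """Return the mirrored bytes for url; raise LookupError if absent."""
        path = self.local_path(url)
        if path is None:
            raise LookupError(f"no local copy of {url}")
        return path.read_bytes()
```

With a file saved at `mirror/www.w3.org/2001/xml.xsd`, `LocalMirror("mirror").read("http://www.w3.org/2001/xml.xsd")` would return its contents without touching the network. Wiring such a lookup into a particular parser's entity or resource resolver is parser-specific and is not shown here.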
We have the same issue :x
Same issue here : it is blocking us for our development.
Same issue, we are not able to build the web services
Confirmed. Validation does not work with Apache 8.5 + Java 1.8.
Are you planning to finish this testing on Monday?
Hi Gerald,
Could you please tell me if you are planning a new test phase in the near future? If yes, when?
Thanks
We have the same issue :x
When will this change be permanently switched to HTTPS? Any plan to turn on the test again? I believe this change will impact many applications and APIs, not only XML related.
Was this change made permanent recently? Our apps are unable to resolve http://www.w3.org/2001/XMLSchema.dtd as of 12/10/22.