MarkupValidator

From W3C Wiki


W3C MarkUp Validator

In a Nutshell

The W3C Markup Validator is open source software and a free W3C service that helps authors of Web documents fix errors in the markup of their HTML, XHTML, MathML and other resources.


Project Logistics

Project Host

W3C is hosting and leading this project, and has been doing so since 1997.

Online Services

The main public service is at [1], with a developers' instance at [2].

Source Code / Download

See [3] for source download options. The code is hosted in W3C's public CVS repository.

Test Suite

The validator has an automated test suite written in Python, which needs the jinja2 library to run. The list of test cases generally used is the one on the dev server. Instructions for running the harness are similar to those for the link checker test suite.

Installation Notes / Discussions

See [4] for the main installation instructions. That document points to platform-specific installation instructions, in particular for Mac OS X and Windows.

Discussion / Development / Feedback Fora

User Mailing-list www-validator@w3.org

  • Archive
  • ~200 subscribers
  • ~20 very active participants (suggestions, user support)
  • Healthy/Active user community
  • 100~200 messages/month

Developer Mailing-list public-qa-dev@w3.org

  • Archive
  • shared with other projects, hackers' forum
  • ~40 subscribers
  • ~30 messages/month

Bug Tracking

Public W3C Bugzilla and a specific feedback page

Development

Languages Used

Perl. The test suite is written in Python. Development requires a good understanding of HTML/XML, and occasionally a deep understanding of SGML/XML DTDs.

Development Speed

Medium. This is a production service trusted by many, so changes are hard to add. The code is showing some signs of age.

We receive a few bug reports every month (<10) and patches now and then (1 every month?).

Development currently (2008) mostly involves adding specific checks on top of DTD validation, performance improvements, and syncing of DTDs to follow the development of the XHTML specs.

Plan and Vision

For a snapshot of the roadmap at the time of the latest release, see http://validator.w3.org/todo

High-Level Objectives

  • Provide the web with a one-stop service for Web Quality check
  • Help raise quality for (m)any kind(s) of Web content
  • Build a positive culture of Web Quality
  • Future-proof our services (new formats, new usage)
  • Leverage the community's energy
  • Remain the source trusted by professionals
  • Find the right balance between accuracy and user-friendliness

Roadmap

  • Multi-engine validator. The current validator is mostly based on a DTD parser, with an XML parser used only for some checks. The current development version plugs into an HTML5 parser for the validation of HTML5 content. In the future, other engines could be used to check compound XML documents (with NVDL + RELAX NG, XML Schema, Schematron, using e.g. the Relaxed engine).
  • Multilingual tool. The Markup Validator receives 1M requests per day, and is only in English. Making it multilingual would make the tool easier to use for web developers and designers worldwide. Although this may be technically tricky (given the number of message/engine sources), the community would be very excited to participate in the translation effort.
  • Site-wide services. The markup validator currently checks a single page. Some companion software (such as the log validator) could be made into a web service to provide crawling, batch validation, scheduled checks etc.
  • Check beyond markup. This may be in the roadmap for other tools rather than the markup validator, but it fits in the "long-term" vision of developing the W3C Web Quality services. Checking of RDDL, RDFa, microformats and other rich markup is in scope. Many other checks could be added to the validators, such as:
    • document cacheability
    • spell checking
    • semantic extraction
    • accessibility evaluation
  • Less finger pointing, more problem solving. Most of our tools, and especially the "star" HTML validator, have a binary "valid/invalid" way of presenting their results. While this is useful for some, it tends to make people look away from the "big picture" of web quality. A new one-stop quality checker could help bring a paradigm shift by showing diverse aspects of web quality, while systematically suggesting solutions for every problem. This would involve working with designers to find ways to present aggregated quality information in a clear and positive manner.
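
The multi-engine idea in the roadmap can be illustrated with a minimal sketch of how a front end might choose an engine from a document's doctype. This is not the validator's actual code (which is written in Perl): the engine names, the pick_engine function, and the fallback to the DTD engine for documents without a doctype are all assumptions made for illustration.

```python
import re

# Hypothetical engine names; the real validator's internals may differ.
DTD_ENGINE = "dtd"
XML_ENGINE = "xml"
HTML5_ENGINE = "html5"

# Captures everything between "<!DOCTYPE" and the closing ">".
# Simplification: doctypes with an internal subset are not handled.
_DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)


def pick_engine(document: str) -> str:
    """Return the name of the engine that should check this document."""
    match = _DOCTYPE_RE.search(document)
    if match is None:
        # No doctype: use the XML parser if there is an XML declaration,
        # otherwise fall back to the DTD engine (an assumed default).
        if document.lstrip().startswith("<?xml"):
            return XML_ENGINE
        return DTD_ENGINE

    declaration = match.group(1).strip()
    # The HTML5 doctype is just "html", with no public identifier.
    if declaration.lower() == "html":
        return HTML5_ENGINE
    # Any other doctype is assumed to reference a DTD.
    return DTD_ENGINE


if __name__ == "__main__":
    print(pick_engine("<!DOCTYPE html><html></html>"))
```

Keeping the choice in one small function would let further engines (for example one for compound XML documents) be added without touching the callers, though a real dispatcher would also need to consider the served content type and any user overrides.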

See also