From W3C Wiki
W3C MarkUp Validator
In a Nutshell
The W3C Markup Validator is open source software and a free W3C service that helps authors of Web documents fix errors in the markup of their HTML, XHTML, MathML, and other resources.
W3C is hosting and leading this project, and has been doing so since 1994 (?).
Source Code / Download
The validator has an automated test suite written in Python, which needs the jinja2 library to run. The list of test cases generally used is the one on the dev server. Instructions for running the harness are similar to those for the link checker test suite.
Installation Notes / Discussions
See the main installation instructions. That document points to platform-specific installation instructions, for Mac OS X and Windows in particular.
Discussion / Development / Feedback Fora
User Mailing-list firstname.lastname@example.org
- ~200 subscribers
- ~20 very active participants (suggestions, user support)
- Healthy/Active user community
- 100~200 messages/month
Developer Mailing-list email@example.com
- shared with other projects, hackers' forum
- ~40 subscribers
- ~30 messages/month
The validator is written in Perl; the test suite is in Python. Development requires a good understanding of HTML/XML and, occasionally, a deep understanding of SGML/XML DTDs.
Medium. This is a production service trusted by many. Changes are hard to add. Code is showing some signs of age.
We receive a few bug reports every month (fewer than 10) and patches now and then (about one a month?).
Development currently (2008) mostly involves adding specific checks on top of DTD validation, performance improvements, and syncing of DTDs to follow the development of the XHTML specs.
Plan and Vision
For a snapshot of the roadmap at the time of the latest release, see http://validator.w3.org/todo
- Provide the web with a one-stop service for Web Quality check
- Help raise quality for (m)any kind(s) of Web content
- Build a positive culture of Web Quality
- Future-proof our services (new formats, new usage)
- Leverage the community's energy
- Remain the trusted source by professionals
- Find the right balance between accuracy and user-friendliness
- Multi-engine validator. The current validator is mostly based on a DTD parser, with an XML parser used only for some checks. The current development version plugs into an HTML5 parser for the validation of HTML5 content. In the future, other engines could be used to check compound XML documents (with NVDL + RELAX NG, XML Schema, Schematron, using e.g. the Relaxed engine).
- Multilingual tool. The Markup Validator receives 1M requests per day, and is only available in English. Making it multilingual would make the tool easier to use for web developers and designers worldwide. Although this may be technically tricky (given the number of message/engine sources), the community would be very excited to participate in the translation effort.
- Site-wide services. The markup validator currently checks a single page. Some companion software (such as the log validator) could be made into a web service to provide crawling, batch validation, scheduled checks etc.
- Check beyond markup: This may be in the roadmap for Unicorn rather than the markup validator, but it fits in the "long-term" vision of developing the W3C Web Quality services. Checking of RDDL, RDFa, microformats and other rich markup is in scope. Many other checks could be added to the validators, such as:
- document cacheability
- spell checking
- semantic extraction
- accessibility evaluation
- Less finger pointing, more problem solving
Most of our tools, and especially the "star" HTML validator, have a binary "valid/invalid" way of presenting their results. While this is useful for some, it tends to make people look away from the "big picture" of web quality. A new one-stop quality checker could help bring a paradigm shift by showing diverse aspects of web quality, while systematically suggesting solutions for every problem. This would involve working with designers to find ways to present aggregated quality information in a clear and positive manner.
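The multi-engine idea listed above can be illustrated with a small dispatch sketch. This is not the validator's actual code (which is written in Perl): the engine names and the `select_engine` function are hypothetical, and the DOCTYPE heuristic is an assumption chosen only to show how a front end might route a document to a DTD-based engine, an HTML5 parser, or a schema-based engine.

```python
import re

# Hypothetical engine identifiers; the real validator's internals may differ.
DTD_ENGINE = "dtd"
HTML5_ENGINE = "html5"
SCHEMA_ENGINE = "schema"

# Captures the root element name and whatever follows it inside the DOCTYPE.
_DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^\s>]+)([^>]*)>", re.IGNORECASE)


def select_engine(document: str) -> str:
    """Pick a validation engine from the document's DOCTYPE (illustrative heuristic)."""
    match = _DOCTYPE_RE.search(document)
    if match is None:
        # Assumption: no DOCTYPE means a compound XML document,
        # to be handled by a schema-based engine.
        return SCHEMA_ENGINE
    root, rest = match.group(1), match.group(2).strip()
    if root.lower() == "html" and rest == "":
        # A bare <!DOCTYPE html> is the HTML5 doctype.
        return HTML5_ENGINE
    # A DOCTYPE carrying a public or system identifier refers to a DTD.
    return DTD_ENGINE
```

A real dispatcher would also need to consider the media type, the XML namespace, and user overrides; the sketch only shows the routing step.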