W3C HTML Validation Service: Source Code
The source code for the W3C HTML Validation Service is available under the terms
of the W3C
Software Copyright.
The purpose of making the source code available is partly to allow
others to set up mirrors of the service (either publicly or within
an intranet behind a firewall), but also to allow us to collaborate
on making the service better. There are many ways the service could be improved, but I
only have a small amount of time to work on it myself.
You can retrieve the code in a number of ways:
- If you just want to glance at the code, or see its revision
history, you can browse it on the web.
The most interesting files are currently a
CGI script called "check" that does pretty much everything,
and possibly also the
httpd.conf. (Select the topmost revision numbers on these pages to see
the most recent revision of each file.)
- If you want a copy of all the files that make up the
service, you can grab a tarball (~1.5M, updated every day at 06:00 ET).
- If you intend to actively mirror the code and/or contribute
patches to the code, you should install and become familiar with CVS; this is the
tool we use for revision control (it is also used by the Apache and Mozilla developers,
and is generally a good thing to get to know.) More information
on CVS is available courtesy of Pascal Molli.
Our CVS repository is available read-only, using CVS pserver authentication
a la:
bash$ export CVSROOT=":pserver:anonymous@dev.w3.org:/sources/public"
bash$ cvs login
(Logging in to anonymous@dev.w3.org)
CVS password: anonymous
bash$ cvs get validator
cvs server: Updating validator
cvs server: Updating validator/htdocs
U validator/htdocs/about.html
...
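Once a working copy exists, keeping a mirror current is usually a matter of running "cvs update" inside it rather than checking everything out again. The following is only a minimal sketch of that decision, not part of the service: the function name mirror_command is hypothetical, and it prints the cvs command instead of running it so that it can be tried without network access. It assumes the working copy sits in a directory named after the module, as created by the session above.

```shell
#!/bin/sh
# Repository location copied from the session above.
CVSROOT=":pserver:anonymous@dev.w3.org:/sources/public"

# mirror_command DIR (hypothetical helper)
# Print the cvs command that would create or refresh the working copy in DIR.
mirror_command() {
    dir=$1
    if [ -d "$dir/CVS" ]; then
        # Existing working copy: -d picks up new directories,
        # -P prunes directories that have become empty.
        echo "cd $dir && cvs -q update -d -P"
    else
        # No working copy yet: do a fresh checkout ("get" is an alias for this).
        echo "cvs -d $CVSROOT checkout $dir"
    fi
}

mirror_command validator
```

Printing the command keeps the sketch harmless; a real mirror script would run the command instead, typically from cron, after a one-time "cvs login".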
Prerequisites
Before you will be able to get the code to run, you will need a few
things already installed on your system:
- A Unix-like operating system. It may work with Windows NT or
other systems, but I haven't tried it yet. It has worked on Linux,
Solaris, FreeBSD, and Digital UNIX. If anyone tries it on other
systems and gets it to work, please let
me know (and send me patches, if
they are needed to get it to work elsewhere.)
- SP version 1.2.1 or higher.
SP is the SGML parser used by the service. More recent versions than
1.2.1 will likely work, but I haven't tried them yet. Patches will
likely be necessary if the output format has changed even slightly
since version 1.2.1.
- A collection of DTDs and other SGML
files to validate against. You don't strictly need these
on your system since SP will retrieve them off the Web if you use URIs
in your doctypes, but you probably want them locally for efficiency.
(You don't need to download this tarball if you mirror everything
using CVS.)
- libwww-perl4:
this is a Perl library I use to retrieve documents from other Web sites
before validating them. I hacked it slightly before using it in the
validation service. I would really like to replace it with the more
modern LWP module for Perl 5;
if anyone can do this and supply patches, I would be very grateful!
- A web server: I am currently running Apache version 1.3.1, but likely any
version will work, and other httpd's may work as well.
- Perl: the main
CGI script that does everything is written in Perl; as far as
I know it will work with Perl 4 since my knowledge of Perl is currently
circa 1994, but you should really be using Perl 5. (My version of Perl
is 5.004_04 or higher.)
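Before trying to run the service, it can help to confirm that the command-line prerequisites listed above are actually on your PATH. This is a minimal sketch under some assumptions: check_tools is a hypothetical helper, nsgmls is the parser executable that ships with SP, and the web server is left out because its binary name varies between installations.

```shell
#!/bin/sh
# check_tools NAME... (hypothetical helper)
# Report whether each named command can be found on PATH.
# Returns non-zero if at least one command is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# perl runs the "check" CGI script, nsgmls is SP's parser, cvs fetches the code.
check_tools perl nsgmls cvs || echo "Install the missing tools before continuing."
```

This only checks that the commands exist; it says nothing about versions, so confirming SP 1.2.1 and Perl 5 still has to be done by hand.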
If anyone tries to use the code, and succeeds (or fails), please let me know!
Gerald Oskoboiny
$Date: 1999-02-02 21:52:23 $