RSS Validator

A Schematron Schema for RSS
 

Introduction

DTD-based validation does not provide the flexibility required in many applications. A DTD is limited in the structures that it can specify and cannot validate the contents of elements, e.g. to check that they have the correct length or format. A further limitation is that a validating parser, on encountering a validation error, typically emits only a terse and often cryptic error message.

Schematron is a schema language that allows a document to be validated by testing it against a set of patterns (XPath expressions). Schematron validation rules allow the author to specify a helpful error message which will be provided to the user if an error is encountered.
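As a rough illustration of the idea, a minimal schema in the Schematron 1.5 vocabulary might look like the sketch below. The title, pattern name, rules and messages are hypothetical examples and are not taken from the RSS Validator itself. Each assert or report pairs an XPath test with the message to show the user.

```xml
<?xml version="1.0"?>
<!-- Hypothetical example schema; not part of the RSS Validator -->
<schema xmlns="http://www.ascc.net/xml/schematron">
  <title>Minimal channel check</title>
  <pattern name="Channel basics">
    <!-- The rule applies to every channel element in the document -->
    <rule context="channel">
      <!-- An assert message is shown when its XPath test is false -->
      <assert test="title">A channel must contain a title element.</assert>
      <!-- A report message is shown when its XPath test is true -->
      <report test="count(item) = 0">This channel contains no items.</report>
    </rule>
  </pattern>
</schema>
```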

The RSS Validator is a Schematron schema for RSS ('Rich Site Summary'), the language used to syndicate web content in applications such as My.Yahoo, My.Netscape, My.Userland and Meerkat.

The RSS Validator schema provides all the power of validating against the RSS 0.91 DTD, along with a much richer set of validation rules. These validation rules encompass all the constraints which cannot be expressed in the RSS DTD.
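Two kinds of constraint that a DTD cannot express are a limit on the length of an element's text and a restriction of that text to a fixed set of words. The sketch below shows how such rules could be written in Schematron. The 100-character limit, the pattern name and the message wording are assumptions made for illustration, not values copied from the RSS Validator schema.

```xml
<!-- Hypothetical pattern; limits and messages are illustrative only -->
<pattern name="Content checks" xmlns="http://www.ascc.net/xml/schematron">
  <!-- Length check: '<' must be written as &lt; inside an attribute -->
  <rule context="channel/title">
    <assert test="string-length(.) &lt;= 100">The channel title should be no longer than 100 characters.</assert>
  </rule>
  <!-- Enumeration check: the text must be one of the named days -->
  <rule context="skipDays/day">
    <assert test=". = 'Monday' or . = 'Tuesday' or . = 'Wednesday'
                  or . = 'Thursday' or . = 'Friday' or . = 'Saturday'
                  or . = 'Sunday'">A day element must contain the name of a day, e.g. Monday.</assert>
  </rule>
</pattern>
```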

User Guide

The Schematron schema is first used to generate an XSLT stylesheet. That stylesheet is then run against the input document to apply the validation rules.
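To give a feel for what the generation step produces, the sketch below is a hand-written stylesheet of roughly the shape one might expect for a single rule that asserts that an item element has a channel element as its parent. It is an assumption for illustration only; the stylesheet actually generated by a Schematron implementation will differ in detail.

```xml
<?xml version="1.0"?>
<!-- Hand-written illustration; not actual generated output -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- The rule context becomes a template match pattern -->
  <xsl:template match="item">
    <!-- The assert message is written out when the test is false -->
    <xsl:if test="not(parent::channel)">
      <xsl:text>An item element must appear inside a channel element.&#10;</xsl:text>
    </xsl:if>
    <!-- Carry on checking the children of this element -->
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Keep the document's own text out of the report -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```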

If you are interested in further developing the RSS Validator schema then follow the Developer User Guide. If you are interested in simply validating some RSS documents using the schema, then follow the Author User Guide.

Developer User Guide

You will need the following components:

The following examples assume that you can invoke your XSLT processor using a batch file or shell script as follows:

xt input-document stylesheet output-document

If you have downloaded the schematron-basic implementation (together with the skeleton that it requires), you can generate the validating stylesheet as follows:

xt rss_validator.xml sch-basic.xsl validator.xsl

You can then run the validator against an RSS document as follows:

xt rss_doc.xml validator.xsl [report.txt]

If you spot any errors in the validator, or add any new rules, then please contact me. I'd welcome any feedback or contributions.

Author User Guide

You will need the following components:

The following examples assume that you can invoke your XSLT processor using a batch file or shell script as follows:

xt input-document stylesheet output-document

You can then download the validator stylesheet and run it against an RSS document as follows:

xt rss_doc.xml validator_text.xsl report.txt

The stylesheet produces a plain-text report. However, if you want an HTML version of the report, take a look at schematron-report.

Download

The current version of the RSS Validator schema is 1.0.

Version History

1.0 Final, 13th July 2000
Tidied up line lengths to make the schema a bit more readable. Amended the message text for some assert elements to better fit the intended use. Tidied up the structure to use clearer groupings. Altered the reference for field lengths to point to Dave Winer's document [2] rather than the Netscape [3] instructions. Added a check to ensure that item elements only occur within channel elements.
1.0 beta, 4th July 2000
Added some additional contents.
1.0 alpha, 2nd July 2000
Added comments, to do list, references. Altered day element validation to explicitly check for named days as described in Dave Winer's reference.

TODO List

References

The following web pages proved extremely useful whilst writing this schema:

Page Maintained by Leigh Dodds. Last Updated 13th July 2000