Slug: Configuration

This page contains some notes on how to configure the Slug crawler.

The Configuration File

Slug requires a configuration file that specifies the settings describing how the crawler will operate. Collectively, these settings are known as a profile.

These settings include details such as:

The Slug distribution includes a sample config file config.rdf that demonstrates how to configure all of the current components.

The configuration file is expressed as RDF/XML. A given configuration file may contain entries for more than one profile, so when running the Scutter one must provide the identifier of a Scutter described in the configuration. This is specified with the -id parameter; see Running the Scutter.

The Configuration Schema

The complete schema for the Scutter configuration is available in etc/schema/config.rdfs in the distribution. It is also available online.

The namespace URI is

The preferred namespace prefix is slug.

The following sections describe some of the key classes and relationships.


The slug:Scutter class describes an individual crawler. A given configuration file may describe more than one crawler.
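Because one file may describe several crawlers, it can help to list the Scutter identifiers a configuration contains before choosing a value for the -id parameter. The following is a minimal Python sketch of that idea, not part of Slug itself. It makes several assumptions: the namespace URI http://example.org/slug-config# is a placeholder and must be replaced with the real one from the schema, the identifiers "default" and "persistent" are invented for illustration, and each crawler is assumed to be written as a slug:Scutter element carrying an rdf:about attribute.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Placeholder only (assumption): substitute the namespace URI given by the Slug schema.
SLUG_NS = "http://example.org/slug-config#"


def scutter_ids(rdf_xml: str, slug_ns: str = SLUG_NS) -> list[str]:
    """Return the rdf:about value of every slug:Scutter element, in document order."""
    root = ET.fromstring(rdf_xml)
    ids = []
    # ElementTree addresses namespaced names as "{namespace-uri}localname".
    for node in root.iter(f"{{{slug_ns}}}Scutter"):
        about = node.get(f"{{{RDF_NS}}}about")
        if about is not None:
            ids.append(about)
    return ids


# Hypothetical configuration with two crawlers; the identifiers are illustrative.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:slug="http://example.org/slug-config#">
  <slug:Scutter rdf:about="default"/>
  <slug:Scutter rdf:about="persistent"/>
</rdf:RDF>"""

print(scutter_ids(SAMPLE))
```

This sketch treats the file as plain XML, so it only finds crawlers written in the slug:Scutter element form. A configuration that declares the same resource with rdf:Description and an rdf:type property would need a real RDF parser.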

Configuration Example

For now, see config.rdf for example configurations.

Image courtesy of Elroy Serrao.