W3C Jigsaw     


ProxyDispatcher

The ProxyDispatcher is a filter that applies a rule to each request before the HTTP client-side API emits it. The set of rules can be extended in Java; the currently defined rules are described below.

Warning: When configuring this filter along with Jigsaw's proxy module, you will need to edit Jigsaw's property file manually (it is usually found at config/http-server.props).

ProxyDispatcher Rules

The basic syntax for the ProxyDispatcher rule file is captured by the following BNF:
rule-file := (record)*
record    := comment | rule
comment   := '#' <any chars up to EOL>
rule      := rule-lhs SPACES rule-rhs
rule-lhs  := token | default
default   := 'default'
rule-rhs  := forbid | direct | proxy | authorization | proxyauth
forbid    := 'forbid'
direct    := 'direct'
proxy     := 'proxy' SPACES url
authorization := 'authorization' SPACES user SPACES password
proxyauth := 'proxyauth' SPACES user SPACES password SPACES url
user      := token
password  := token
EOL       := '\r' | '\r\n' | '\n'
SPACES    := (' '|'\t')+
A sample ProxyDispatcher rule file looks like:
# Sample ProxyDispatcher rule file
# --------------------------------

# Make all access to US through us.proxy.com
edu proxy http://us.proxy.com:8080/
org proxy http://us.proxy.com:8080/

# Accesses to French sites are direct (no proxy)
fr  direct

# Accesses to 18.59.*.* network are direct
18.59 direct

# Accesses to the protected site get decorated with auth info:
www.protected.com authorization joe-user joe-password

# Forbid accesses to some sites

www.evilsite.com forbid

# Access this site through myproxy.com with proxy authentication
www.somesite.org proxyauth joe-user joe-password http://myproxy.com:8008/

# Force all other requests to go through world.proxy.org
DEFAULT proxy http://world.proxy.org:8080/
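The grammar above is simple enough to read line by line. The following is a minimal sketch of such a reader, not Jigsaw's own parser: the class name RuleFileSketch, its Rule holder, and the arity helper are hypothetical, and the choice to trim leading whitespace before testing for '#' is an assumption that goes slightly beyond the BNF.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; this is not Jigsaw's own rule parser. */
class RuleFileSketch {

    /** One parsed rule: left-hand side, action keyword, action arguments. */
    static final class Rule {
        final String lhs;
        final String action;
        final List<String> args;

        Rule(String lhs, String action, List<String> args) {
            this.lhs = lhs;
            this.action = action;
            this.args = args;
        }
    }

    /** Number of arguments each rule-rhs keyword takes, per the BNF. */
    static int arity(String action) {
        switch (action) {
            case "forbid":
            case "direct":
                return 0;
            case "proxy":
                return 1; // url
            case "authorization":
                return 2; // user password
            case "proxyauth":
                return 3; // user password url
            default:
                throw new IllegalArgumentException("Unknown rule: " + action);
        }
    }

    /** Parses rule-file text into rules, skipping comments and blank lines. */
    static List<Rule> parse(String text) {
        List<Rule> rules = new ArrayList<>();
        // EOL := '\r' | '\r\n' | '\n' (try '\r\n' first so it counts as one break)
        for (String line : text.split("\r\n|\r|\n")) {
            String trimmed = line.trim(); // assumption: surrounding blanks are ignored
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            // SPACES := (' '|'\t')+
            String[] tok = trimmed.split("[ \t]+");
            if (tok.length < 2 || tok.length - 2 != arity(tok[1])) {
                throw new IllegalArgumentException("Malformed rule: " + line);
            }
            rules.add(new Rule(tok[0], tok[1],
                    Arrays.asList(tok).subList(2, tok.length)));
        }
        return rules;
    }

    public static void main(String[] args) {
        String sample = "# comment\n"
                + "fr  direct\n"
                + "www.somesite.org proxyauth joe-user joe-password http://myproxy.com:8008/\n";
        for (Rule r : parse(sample)) {
            System.out.println(r.lhs + " -> " + r.action + " " + r.args);
        }
    }
}
```

The sketch rejects a line whose argument count does not fit its keyword, which is stricter than a real implementation needs to be; whether Jigsaw reports or ignores such lines is not stated here.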
The rule matching algorithm matches the host name part of URLs, or the numeric address if the host is given numerically; no name resolution is performed. It tries to find the best match, starting with the most significant part of the host (in www.foo.com, com is the most significant part; in 18.23.0.22, 18 is the most significant part) and then walking toward more specific parts. Host names in rules are therefore implicitly "terminated" by *, so to speak. In the sample rule file above, any access to www.foo.fr/x/y would be handled by:
  1. Reversing the host name components: fr foo www
  2. Looking for a match on fr (found)
  3. Looking for a match on fr foo (not found)
In that case the rule found at step 2 is the most specific, and gets applied.
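The steps above can be sketched as a walk down a tree keyed by host components. This is an illustration under stated assumptions, not the filter's actual implementation: the class RuleMatcherSketch and its methods are hypothetical, rules are kept as plain strings, and both spellings of the default keyword (default in the BNF, DEFAULT in the sample file) are accepted because the text does not say which is required.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch of most-specific-match lookup; not Jigsaw's code. */
class RuleMatcherSketch {

    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        String rule; // null when no rule ends at this node
    }

    private final Node root = new Node();
    private String defaultRule; // rule registered under 'default', if any

    /** Splits a host into components ordered from most to least significant. */
    static String[] components(String host) {
        String[] parts = host.toLowerCase(Locale.ROOT).split("\\.");
        boolean numeric = host.matches("[0-9.]+");
        if (!numeric) {
            // Host names are most significant on the right, so reverse them.
            for (int i = 0, j = parts.length - 1; i < j; i++, j--) {
                String tmp = parts[i];
                parts[i] = parts[j];
                parts[j] = tmp;
            }
        }
        return parts;
    }

    /** Registers a rule for a left-hand side such as "fr" or "18.59". */
    void add(String lhs, String rule) {
        if (lhs.equalsIgnoreCase("default")) { // assumption: case-insensitive
            defaultRule = rule;
            return;
        }
        Node node = root;
        for (String c : components(lhs)) {
            Node next = node.children.get(c);
            if (next == null) {
                next = new Node();
                node.children.put(c, next);
            }
            node = next;
        }
        node.rule = rule;
    }

    /** Returns the most specific rule for the host, or the default, or null. */
    String match(String host) {
        String best = defaultRule;
        Node node = root;
        for (String c : components(host)) {
            node = node.children.get(c);
            if (node == null) {
                break; // no longer prefix exists; keep the best found so far
            }
            if (node.rule != null) {
                best = node.rule;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        RuleMatcherSketch m = new RuleMatcherSketch();
        m.add("fr", "direct");
        m.add("18.59", "direct");
        m.add("DEFAULT", "proxy http://world.proxy.org:8080/");
        System.out.println(m.match("www.foo.fr"));
        System.out.println(m.match("18.23.0.22"));
    }
}
```

A tree keeps lookup cost proportional to the number of host components rather than the number of rules, which is one plausible way to support a large rule table.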

The sample rule file above is self-explanatory and illustrates all the rules currently handled by the filter. When used in conjunction with the ICP filter, it can give you a very powerful caching hierarchy.

Note also that the underlying implementation of the rule matching algorithm supports a large number of rules, so a big static routing table is feasible.


Properties

The ProxyDispatcher defines the following properties:
org.w3c.www.protocol.http.proxy.rules
semantics
The location of the rules for the ProxyDispatcher filter. The rule file expresses the rules to be applied when fetching a document; see the rule syntax for more information.
type
This property can be either a full URL or a filename.
default value
This property has no default value, and must be set for the filter to be activated.

org.w3c.www.protocol.http.proxy.debug
semantics
Should the filter emit debugging traces? When set to true, the filter reports which rule it applies to each fetched URL.
type
A boolean property
default value
This property defaults to false.
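As a rough illustration of how the two properties might appear in a property file and be read back, here is a small sketch using the standard java.util.Properties class. The class name ProxyDispatcherPropsSketch and the rule file path config/proxy.rules are hypothetical; only the two property names come from this page, and Jigsaw's own property handling may differ.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustrative sketch; Jigsaw's own property handling may differ. */
class ProxyDispatcherPropsSketch {
    static final String RULES_P = "org.w3c.www.protocol.http.proxy.rules";
    static final String DEBUG_P = "org.w3c.www.protocol.http.proxy.debug";

    /** Loads property-file text into a Properties object. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** The debug property defaults to false when it is absent. */
    static boolean debugEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(DEBUG_P, "false"));
    }

    /** The rules property has no default; null means the filter stays inactive. */
    static String rulesLocation(Properties props) {
        return props.getProperty(RULES_P);
    }

    public static void main(String[] args) {
        // Hypothetical fragment; the rule file path is an assumption.
        String fragment = RULES_P + "=config/proxy.rules\n"
                + DEBUG_P + "=true\n";
        Properties props = load(fragment);
        System.out.println("rules: " + rulesLocation(props));
        System.out.println("debug: " + debugEnabled(props));
    }
}
```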

ylafon
$Id: w3c.www.protocol.http.proxy.ProxyDispatcher.html,v 1.4 1997/09/22 09:02:23 ylafon Exp $