cl-cheshire-cat is a project for a CL redirection server.
The core of the server is based on Edi Weitz (http://weitz.de/) CL librairies, in particular:
- Hunchentoot (Web Server)
- CL-PPCRE
The persistence layer is ensured using CL-STORE (http://common-lisp.net/project/cl-store).
The recommended way to install cheshire is by using quicklisp as much as possible:
- Using
quicklisp
, you can install:
hunchentoot
cl-store
split-sequence
- You need to manually install
sb-daemon
If you use the provided cheshire.sh
and cheshire.lisp
, you also need to
install py-configparser
(available via quicklisp).
The recommended usage requires you to use SBCL (and optionnaly swank) and is divided in a three component process:
- The
cheshire.sh
script which is responsible for daemon management operations. - The
cheshire.lisp
script which is responsible for loading and starting the cheshire daemon. - The daemon itself.
You are encouraged to modify and adapt the first two component in the way most convinient to your own use of the Cheshire.
As mentionned, the default process is dependent on SBCL. However we would like Cheshire to be compatible with as many CL distributions as possible. Any adaptation of the starting process be compatible with a different CL distribution is welcome.
The daemon operations are performed using scripts/cheshire.sh
. The
first argument of the script is either start
, stop
, restart
or status
and makes the script perform the appropriate action. The script should always be
executed as root.
If you copy this script directly into /etc/init.d/
, there is a slight
possibility that the script will run when started as root, but not at boot time
(e.g. on Debian 6.0). In this case, you will need to specify locale and/or home
in the configuration file. That's because the environment is not exactly the
same one when the service is started at boot time.
Second, if you see that the error log is populated with messages "Cannot open
such file" or "Cannot create such directory" when you use daemonize and
privilege dropping, make sure all the directories Cheshire tries to right into
are owned by the appropriate user. This include in particular home/.cache
,
home/.slime
and their sub-directories.
The second argument is expected to be a configuration file. If none is given,
/etc/cheshire.conf
is the default configuration file.
cheshire.sh is looking for a pidfile=
entry in the configuration
file. If there is no such entry, /var/run/cheshire.pid
is used.
The required action is then performed by cheshire.sh
. If it needs to run the
server, it assumes that scripts/cheshire.lisp
is located as
/usr/share/cheshire/scripts/cheshire.lisp
unless it finds a system=
directive in the configuration file.
cheshire.lisp
is loading the configuration file and starting the
daemon. (Please refer to config/cheshire.conf
for the details of the
documentation). cheshire.lisp
expects the configuration file as its
first command-line argument and assumes that the cl-cheshire-cat system is
loadable via (asdf:load-system "cl-cheshire-cat")
.
The starting process includes:
- starting to listen on the specified port
- daemon bookeeping (debugging, dropping privileges, swank loading)
- loading the redirection rules
The privileges are dropped after Cheshire started listening. This is a limitation from usocket and hunchentoot. This means:
- you should not have any problem listening directly on the 80 port (or any other privileged port).
- the few first requests could happen to be served as root. Since the redirection specifications is not loaded untill the privileges are dropped and this issue can be prevented by a system firewall (like iptables), we don't consider it a big security issue.
If the recommended usage does not fit your use of Cheshire, feel free to adapt any of the previously described steps. They are made to be highly and easily customizable.
Note that the debugging bookeeping step in cheshire.lisp
is provided for
convinience. If you skip this part, Hunchentoot defaults will be used:
- the debugger will never be called in case of an error
- error pages will show complete error message and backtrace
The server is listening on the HTTP port (80), awaiting an HTTP query. Other ports will work, but you must keep in mind that no port rewritting facility is currently offered.
The server first search for a domain name rule to apply and then for a URI rewritting rule. The details of each rule is descript is the following section.
The rules are always checked in turn, if a rule matches, it's applied (and the search stop).
If no domain name matching rule is found and the domain name does not start with
www.
, the server will send a "301 Moved Permanently" to the same URL with an
additional www.
prefix to domain name. If the domain name already starts with
www.
, the server will return a "404 Not Found" error in order to avoid
redirection loop.
If a domain name rule matched, Cheshire then search its list of URI rule to apply any addition URI modification. If there is no URI matching rule and no domain rewritting in effect, the server will return a "404 Not Found" error in order to prevent redirection loops.
Domain name rules parameters are used as default for its associated URI rules.
The loop protections intend to protect you from mild obvious rule specification
errors. Their goal is not to prevent willingful redirection loops or
mischievious configurations. For example it is easy to trick Cheshire into
issuing infinite redirection loops with a domain name rule with no URI rule, key
(:exact "www.domain.example")
and replacement "www.domain.example"
. In other
words you are responsible for the correctness of your redirection
rules. Cheshire will not check them for you and there is no plan to do so in the
future.
Since Cheshire intends to be a production-grade product, we made our best to keep unspecified behavior implementation on a fail-early and safe basis.
For example, if you provide an invalid argument for a rule specification, we try to fail and send an error message when you create or update the rule rather than at apply time.
However, this behavior may not be always easy to have, don't forget that unspecified behaviour is still unspecified and may make Cheshire to crash on each and every request it receives.
Each rule is composed of two main elements:
- The rule's key is identifying whether the rule can be applied to a given URL. The key is itself composed of two elements:
- The kind specifies the algorithm used to match the rule.
- The match specification gives the information to match the rule against the URL.
- The rule's effect including the modification to apply to the matching URL, options for the HTTP Answer or subsequent rules to apply.
Three rules matching algorithms (kinds) are used:
exact
suffix
(matches if the end of the domain name matches)regex
Following the DNS specification ([RFC1035]), domain name matches are always case-insensitive.
If the rewrite specification is not nil, the domain will be rewritten.
If the rule is an exact
match, the rewritte specification is use to replace the
whole domain name.
If the rule is a suffix
match, the rewritte specification may contain once the
special substring "\1"
, which will be replaced by the prefix (non matching part)
of the original domain name.
If the rule is a regex
match, any string replacement accepted by
cl-ppcre:regex-replace
will be accepted and the result will be equivalent to
(cl-ppcre:regex-replace regex original-domain-name replacement)
. Function
designators are not allowed since they cannot be saved easily.
A list of URI rewritting rules can be used for each domain.
Three rules matching algorithms (kinds) are used:
exact
prefix
(matches if the beggining of the URI matches)regex
URI match are case-sensitive by default (this can be changed via regex flags).
URI replacements for exact
and regex
match occurs the same way as Domain
rewritting.
If the rule is a prefix
match, the rewritte specification may contain once the
special substring "\1"
, which will be replaced by the suffix (non matching part)
of the original URI.
Each redirection specification can also include redirection parameters.
The first parameter supported is http-code
and its default value is 301.
http-code
must be one of 300, 301 (default), 302, 303 or 307. This status code
will be the one used in the answer sent to the client. The behavior is
unspecified if an invalid status code specification is given.
The second parameter is protocol
and its default value is http
.
protocol
must be one of http
or https
and it is the protocol that will be
used as the redirection target.
The third and last parameter is port
and its default depends on the value of
protocol
(443 if protocol
is https
, 80 otherwise).
port
must be a valid port number (i.e. an integer between 1 and 63565).
Each rule can be given a list of query string manipulation to do. These operations are run before any rewritting occurs.
There are 5 operations which can be exectued on the list of get parameters:
The default behaviour is to leave the query string unmodified.
The first operation is the basic clearing of the whole query string in order to have a fresh start. Any operation executed before that one will be without effect.
You can choose to add any get parameter to the URL. If the parameter already exists, it will be twice in the query string, which may create unexpected behaviour on the other side of the redirection.
The new parameter's value can be:
- a constant string your giving to the operation
- the path of the query (before rewritting)
- the domain name of the query (before rewritting)
Parameters:
name
: the name of the paramter you will createvalue
: the value of the parameter (a string,:path
or:domain
)
Of course, if you can add, you can delete.
Parameter:
name
: the name of the parameter to delete
The value update is performed by applying a regular expression substitution to the value of the parameter.
Parameters:
name
: the name of the parameter to updatematch
: the regular expression used to match the old valuereplacement
: the replacement string used to defini the new value
Often, it's easier to rename a parameter than delete and re-create it.
Parameters:
name
: the old name of the parameternew-name
: the new name of the parameter
Management options can be setup when creating the server.
Currently two management options are supported:
admin-host
: The domain name used to manage the server (default is"management.invalid"
).admin-allowed
: A list of CIDR blocks. IP addresses in these blocks are allowed to manage the server (default is localhost only)
Each CIDR network specifications is a pair of two elements. The first one is an
IPv4 address (either a string in dotted notation "127.0.0.1"
or a vector of
four integers in host order #(127 0 0 1)
). The second is the prefix-length of
the CIDR block. If the second part is missing, its default value is 32.
The recommended tool to manage Cheshire is curl or another low level HTTP or TCP tool such as nc(1) or telnet(1).
The management API is splited in three parts:
- Global management
- Domain rules management
- URI rules management
Each operation is specified using three mecanisms:
- The URI of the operation (to choose what you want to do).
- The GET parameters (URL query string) to select the rule you want to manage.
- The POST parameters to provide the information required by the operation you want to perform.
Global management operations are impacting the behavior of the whole server.
Path: /save-rules
POST Parameter:
name
(optional): name of the file in which the rules will be stored. If the parameter is not provided, the file used is the one from which the rules have been pre-loaded (rules-file
).
The file will be stored in the directory specified as the rules-directory
configuration and with the crr
extension. If there is no such configuration,
the directory of the rules-file
configuration will be used.
Example:
POST /save-rules HTTP/1.1
Host: management.invalid
Content-type: application/x-www-form-urlencoded
name=my-bkp
Path: /load-rules
POST Parameter:
name
(optional): name of the file from which the rules will be loaded. If the parameter is not provided, the file used is the one from which the rules have been pre-loaded (rules-file
).erase-all
: If this parameter is not given, an error message will be sent. Since loading a new set of rules delete any other rule currently in effect, this confirmation is here as a security.
The file will be loaded from the directory specified as the rules-directory
configuration and with the crr
extension. If there is no such configuration,
the directory of the rules-file
configuration will be used.
Example:
POST /load-rules HTTP/1.1
Host: management.invalid
Content-type: application/x-www-form-urlencoded
name=my-bkp
erase-all=OK
The domain rule management operations are in the "folder"
/domain-name-rule/
. For each of the following rule, the path will
therefore be prefixed by /domain-name-rule
.
Path: /list
GET parameters (all optionals):
kind
: The kind of the domain rule key (should be one ofexact
,suffix
orregex
.match
: A regular expression applied on the domain rule match specification.replacement
: A regular expression applied on the domain rule replacement specification.
Returns the list of domain names rules matching the parameters. If a criteria is omitted all rules will match this criteria.
Example:
GET /domain-name-rule/list?kind=exact HTTP/1.1
Host: management.invalid
Path: /add
POST parameters:
kind
: The kind of match string. Must be one of "exact", "suffix" or "regex".match
: The match string used to check whether the rule matches the current URL.replacement
(optional): The replacement string used to modify the URL part matching the rule.http-code
(optional): The HTTP Status code to be sent along for this redirection.position
(optional): Position at which the rule will be inserted in the rule list. If none is provided, the rule is inserted at the beginning of the list.protocol
(optional): Protocol to use after the redirection. Must be one of http or https.port
(optional): Port number to use after the redirection.
Example:
POST /domain-name-rule/add HTTP/1.1
Host: management.invalid
Content-type: application/x-www-form-urlencoded
kind=exact&match=www.domain.example&position=3
Path: /remove
GET parameters:
kind
: The kind of the rule.match
: The match spec of the rule.
POST parameter:
confirmed
: This parameter must be non-nil.
Example:
POST /domain-name-rule/remove?kind=exact&match=www.domain.example HTTP/1.1
Host: management.invalid
Content-type: application/x-www-form-urlencoded
confirmed=OK
Path: /update
GET parameters:
kind
: The kind of the rule.match
: The match spec of the rule.
POST parameters (all optional):
new-kind
: The new kind to use for the rule from now on.new-match
: The new match spec to use for the rule from now on.new-replacement
: The new replacement to use for the rule from now on.new-http-code
: The new HTTP Status code to be used for this rule redirection from now on.
The query string operations management is in the "folder"
/query-string-updates/
. See Query string operation management for the details.
When applied to domain name rule, all these operations have two common GET parameters:
domain-name-kind
: The kind of the domain name rule.domain-name-match
: The match spec of the domain name rule.
The URI rule management operations are in the "folder"
/uri-rule/
. For each of the following rule, the path will
therefore be prefixed by /uri-rule
.
All this operations need to know on wich parent domain name rule they operate. Thus, each rule has two common GET parameters:
domain-name-kind
: The kind of the domain name rule.domain-name-match
: The match spec of the domain name rule.
Path: /list
GET parameters (all optionals):
kind
: The kind of the domain rule key (should be one ofexact
,prefix
orregex
.match
: A regular expression applied on the domain rule match specification.replacement
: A regular expression applied on the domain rule replacement specification.
Returns the list of URI rules matching the parameters. If a criteria is omitted all rules will match this criteria.
Example:
GET /uri-rule/list?domain-name-kind=exact&domain-name-match=www.domain.example&kind=exact HTTP/1.1
Host: management.invalid
Path: /add
POST parameters:
kind
: The kind of match string. Must be one of "exact", "suffix" or "regex".match
: The match string used to check whether the rule matches the current URL.replacement
(optional): The replacement string used to modify the URL part matching the rule.http-code
(optional): The HTTP Status code to be sent along for this redirection.position
(optional): Position at which the rule will be inserted in the rule list. If none is provided, the rule is inserted at the beginning of the list.protocol
(optional): Protocol to use after the redirection. Must be one of http or https.port
(optional): Port number to use after the redirection.
Example:
POST /uri-rule/add?domain-name-kind=exact&domain-name-match=www.domain.example HTTP/1.1
Host: management.invalid
Content-type: application/x-www-form-urlencoded
kind=prefix&match=%2Ffoo&position=2
Path: /remove
GET parameters:
kind
: The kind of the rule.match
: The match spec of the rule.
POST parameter:
confirmed
: This parameter must be non-nil.
Example:
POST /uri-rule/remove?domain-name-kind=exact&domain-name-match=www.domain.example&kind=exact&match=www.domain.example HTTP/1.1
Host: management.invalid
Content-type: application/x-www-form-urlencoded
confirmed=OK
Path: /update
GET parameters:
kind
: The kind of the rule.match
: The match spec of the rule.
POST parameters (all optional):
new-kind
: The new kind to use for the rule from now on.new-match
: The new match spec to use for the rule from now on.new-replacement
: The new replacement to use for the rule from now on.new-http-code
: The new HTTP Status code to be used for this rule redirection from now on.
The query string operations management is in the "folder"
/query-string-updates/
. See Query string operation management for the details.
When applied to domain name rule, all these operations have two more common GET parameters:
uri-kind
: The kind of the domain name rule.uri-match
: The match spec of the domain name rule.
Query string management is currently very simple.
Path: /list
GET parameters (all optionals):
operation
: The operation filter of the domain rule key (should be one ofclear
,add
,delete
,update
orrename
).- Any combination of
name
,new-name
,match
,replacement
: a regex on the corresponding field.
Returns the list of query string operations matching the parameters. If a criteria is omitted all rules will match this criteria.
Example:
GET /domain-name-rule/query-string-updates/list?domain-name-kind=exact&domain-name-match=www.domain.example&operation=clear HTTP/1.1
Host: management.invalid
Path: /add
POST parameters:
operation
: The operation filter of the domain rule key (should be one ofclear
,add
,delete
,update
orrename
).- Any combination of
name
,new-name
,match
,replacement
which is valid for this operation. position
(optional): Position at which the operation will be inserted in the operation list. If none is provided, the operation is inserted at the beginning of the list.
When you create an add
operation, you should provide one of the following
parameters (they are mutually exclusive):
value
: Its value is the value that will be given to the new GET parameterpath-as-value
: If present, the path will be used as value for the new GET parameterdomain-as-value
: If present, the domain name will be used as value for the new GET parameter
Example:
POST /uri-rule/query-string-updates/add?domain-name-kind=exact&domain-name-match=www.domain.example&uri-name=%2Ffoobar%2F&uri-kind=prefix HTTP/1.1
Host: management.invalid
Content-type: application/x-www-form-urlencoded
operation=add&name=redirect-from&domain-as-value=t
Path: /remove
GET parameters:
operation
: The kind of the rule.name
: The name of the query parameter on which the operation is applied.match
: The match specification used in a query string update operation.
If name
or match
is not supported by the operation you want to delete, you
must omit them.
POST parameter:
confirmed
: This parameter must be non-nil.
Example:
POST /uri-rule/query-string-updates/remove?domain-name-kind=exact&domain-name-match=www.domain.example&uri-name=%2Ffoobar%2F&uri-kind=prefix&operation=rename&name=domain-name-match HTTP/1.1
Host: management.invalid
Content-type: application/x-www-form-urlencoded
confirmed=OK
The data are stored in memory by a hierarchy of classes, check redirection-rule
and subclasses.
-
The implementation restricted part should be extracted away in another package/project (TODO).
-
Because of the inherent multiple domain names property of the project, Cheshire supports only HTTP, no HTTPS now or in the forseenable future is planned.
-
No offline rule management.
This software has been originally developed by Mathieu Lemoine [email protected] and is the sole property of Mentel Inc. (www.mentel.com). This software is distributed under the terms of the 3-clause BSD, stated as follow:
Copyright © 2012, Mathieu Lemoine [email protected], Mentel Inc. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of "Mentel Inc." nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.