Datalidator is a flexible, object-oriented Python library for parsing and validating untrusted input data.
One of its most prominent characteristics is that it can parse many kinds of input data, such as configuration files, user inputs and web API outputs.
The main building blocks of this library's functionality are the following 3 types of objects:
- Blueprints – The purpose of blueprints is to safely and reliably parse untrusted[1] input data to a specific blueprint's output data type and raise an appropriate exception if it is not possible (i.e. blueprints must be able to react to input data of any type and value without getting into an unexpected state under regular conditions). If applicable, blueprints should also be capable of running the parsed data through filters and validators. All blueprints must implement BlueprintIface.
- Filters – The purpose of filters is to modify data already parsed by a blueprint (i.e. NOT the untrusted input data!) in some way without changing their data type. Filters are normally not used directly, but instead through a blueprint. All filters must implement FilterIface.
- Validators – The purpose of validators is to check whether data already parsed by a blueprint and optionally filtered by zero or more filters (i.e. NOT the untrusted input data!) meet certain requirements and, if not, raise DataValidationFailedExc. Validators are normally not used directly, but instead through a blueprint. All validators must implement ValidatorIface.
[1] "Untrusted data" here means data that were acquired in a way that does not allow arbitrary code to be put into the application (e.g. a deserialized JSON document or an HTTP POST body). Therefore, for example, if a blueprint is used on a malicious unpickled object, the malicious code can get executed by the blueprint!
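The distinction drawn in the footnote can be illustrated with a short, self-contained sketch. It does not use Datalidator at all: `Sneaky` and `naive_parse_int` are hypothetical names invented for this illustration, and the "malicious" code only appends a string to a list. The point is that merely converting an attacker-controlled object runs methods defined by that object, whereas data produced by `json.loads()` can only consist of plain built-in types.

```python
import json

side_effects = []


class Sneaky:
    """Hypothetical stand-in for an object an attacker could smuggle in,
    for example through pickle. Not part of Datalidator."""

    def __int__(self):
        # Runs as soon as anything calls int() on the object.
        side_effects.append("attacker-defined code ran")
        return 42


def naive_parse_int(input_data):
    # Hypothetical parser used only for this illustration.
    return int(input_data)


# json.loads() yields only dict, list, str, int, float, bool or None,
# so no attacker-defined methods can be attached to the result.
parsed_safe = naive_parse_int(json.loads('"123"'))

# An arbitrary object is different: parsing it executes its own code.
parsed_sneaky = naive_parse_int(Sneaky())
```

After this runs, `side_effects` is no longer empty, which is why input that may carry arbitrary objects has to be treated differently from input that is merely untrusted in value.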
All blueprints, filters and validators also implement DatalidatorObjectIface.
See the class hierarchy document to find out what blueprints, filters and validators come with this library.
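The division of labour between the three object types described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not Datalidator's actual code: every class and method name in it (`SketchStringBlueprint`, `SketchStripFilter`, `SketchNotEmptyValidator`, `use`, `filter`, `validate`, and the two exception classes) is an assumption made for the example. The real interfaces are BlueprintIface, FilterIface and ValidatorIface.

```python
class SketchParsingFailed(Exception):
    """Hypothetical: input data could not be parsed."""


class SketchValidationFailed(Exception):
    """Hypothetical: parsed data did not meet a requirement."""


class SketchStripFilter:
    """Filter: modifies already-parsed data without changing its type."""

    def filter(self, data: str) -> str:
        return data.strip()


class SketchNotEmptyValidator:
    """Validator: checks already-parsed (and filtered) data."""

    def validate(self, data: str) -> None:
        if not data:
            raise SketchValidationFailed("string must not be empty")


class SketchStringBlueprint:
    """Blueprint: parses untrusted input, then runs filters and validators."""

    def __init__(self, filters=(), validators=()):
        self._filters = tuple(filters)
        self._validators = tuple(validators)

    def use(self, input_data) -> str:
        # 1. Parse: accept only a few known-safe input types.
        if isinstance(input_data, bool) or not isinstance(input_data, (str, int, float)):
            raise SketchParsingFailed(f"cannot parse {type(input_data).__name__}")
        parsed = str(input_data)
        # 2. Filter: each filter receives parsed data, never the raw input.
        for f in self._filters:
            parsed = f.filter(parsed)
        # 3. Validate: each validator either passes silently or raises.
        for v in self._validators:
            v.validate(parsed)
        return parsed


blueprint = SketchStringBlueprint(
    filters=[SketchStripFilter()],
    validators=[SketchNotEmptyValidator()],
)
result = blueprint.use("  hello ")
```

Note that the filter and the validator are only ever reached through the blueprint, so they never see data that have not already been parsed into the expected type.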
Blueprints, filters and validators may raise subclasses of the following exception and error base classes:
- DatalidatorExc – A superclass of all exceptions raised by Datalidator. One should catch exceptions that inherit from this class in the vast majority of cases because they can commonly occur even when one is using the library in a completely correct way – for example when invalid input data are passed to a blueprint (which is something that will happen if this library is used to handle untrusted user input in any way).
- DatalidatorError – A superclass of all errors raised by Datalidator. In most cases, one should not catch errors that inherit from this class, because they can occur only when the library is used incorrectly (for example when an invalid argument is passed to a blueprint's initializer) and not as a result of invalid input data passed to a blueprint (that is what DatalidatorExc is for). Therefore, these errors are not mentioned in methods' docstrings.
See the exception hierarchy document to find out what exception and error classes come with this library.
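The practical consequence of the two base classes can be sketched as follows. To keep the example self-contained, `DatalidatorExc` and `DatalidatorError` are defined here as local stand-ins; in real code they would be imported from the library, whose import paths are not shown in this document. `SketchPortBlueprint`, its `use` method and `read_port` are likewise hypothetical names invented for this illustration.

```python
class DatalidatorExc(Exception):
    """Local stand-in: something was wrong with the input data."""


class DatalidatorError(Exception):
    """Local stand-in: the library was used incorrectly."""


class SketchPortBlueprint:
    """Hypothetical blueprint that parses a TCP port number."""

    def __init__(self, max_port=65535):
        if not isinstance(max_port, int):
            # Programmer mistake: signalled with an *error*.
            raise DatalidatorError("max_port must be an int")
        self._max_port = max_port

    def use(self, input_data):
        try:
            port = int(input_data)
        except (TypeError, ValueError, OverflowError) as e:
            # Bad input data: signalled with an *exception*.
            raise DatalidatorExc("input is not an integer") from e
        if not 1 <= port <= self._max_port:
            raise DatalidatorExc("port is out of range")
        return port


def read_port(raw, default=8080):
    blueprint = SketchPortBlueprint()
    try:
        return blueprint.use(raw)
    except DatalidatorExc:
        # Expected at runtime whenever the input is invalid.
        return default
    # DatalidatorError is deliberately not caught: it indicates a bug
    # in the calling code and should surface during development.


port_ok = read_port("443")
port_fallback = read_port("not a port")
```

Catching only the exception base class keeps invalid input from crashing the application while still letting usage mistakes fail loudly.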
- Next chapter: 2. Using Blueprints