This is an alpha feature. Please do not rely on it. If you find issues with it, please report them.
Violation is invoked using the `violate` command. An output directory must be specified rather than a single file.
- For each rule `R`:
  - Create a new version of the profile where `R` is wrapped in a `ViolationConstraint`. A violation constraint works similarly to a `not` constraint, except that `and` conditionals are treated differently (see below).
  - Create a decision tree from that profile, and pass it to the generator as normal.
  - Write the output to a file in the output directory with a numerical file name.
- Output a valid version with no rules violated.
- Output a manifest file, listing which output file corresponds to which rule.
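The per-rule loop above can be sketched roughly as follows. This is an illustration only, not the tool's actual implementation: `violate_profile`, the `generate` callable, the tuple used to mark a violated rule, and the `manifest.json` file name are all hypothetical. The valid (unviolated) output is left out because it is not stated here how that file is named.

```python
import json
from pathlib import Path


def violate_profile(rules, generate, output_dir):
    """Hypothetical driver: one output file per violated rule, plus a manifest.

    rules      -- list of dicts, each with a "description" key (assumed shape)
    generate   -- callable taking a profile and returning CSV text (assumed)
    output_dir -- directory that receives the numbered files
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = []
    for index, rule in enumerate(rules, start=1):
        # Wrap only the current rule; every other rule stays as written.
        profile = [("violate", r) if r is rule else r for r in rules]
        filename = f"{index:03d}.csv"
        (out / filename).write_text(generate(profile))
        manifest.append(
            {"filepath": filename, "violatedRules": [rule["description"]]}
        )
    # The manifest file name is an assumption made for this sketch.
    (out / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

Each pass violates exactly one rule, so a failure in a consuming system can be traced back to a single rule through the manifest.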
An example of a manifest file when violating one rule, where `001.csv` is the output file in which the first rule has been violated:
    [
      {
        "filepath": "001.csv",
        "violatedRules": [ "Price field should not accept nulls" ]
      }
    ]
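A consumer of the output might use the manifest to look up which rule a given file violates. The sketch below parses the example manifest shown above; the helper name `rules_by_file` is hypothetical and not part of the tool.

```python
import json

# The example manifest from above, inlined so the snippet is self-contained.
manifest_text = """
[
  {
    "filepath": "001.csv",
    "violatedRules": [ "Price field should not accept nulls" ]
  }
]
"""


def rules_by_file(text):
    """Map each output file name to the list of rules violated in it."""
    return {entry["filepath"]: entry["violatedRules"] for entry in json.loads(text)}


lookup = rules_by_file(manifest_text)
print(lookup["001.csv"])  # ['Price field should not accept nulls']
```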
The `violation` constraint is an internal constraint: it cannot be used directly in a profile and exists only to support violation. It works exactly like a `not` constraint, except when dealing with `and` constraints.
- A `not` constraint converts `¬AND(X, Y, Z)` into `OR(¬X, ¬Y, ¬Z)`
- A `violate` constraint converts `VIOLATE(AND(X, Y, Z))` into:

      OR(
        AND(VIOLATE(X), Y, Z),
        AND(X, VIOLATE(Y), Z),
        AND(X, Y, VIOLATE(Z)))
This is so that we end up with each inner constraint violated separately.
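The difference between the two transformations can be illustrated with a small sketch over a toy constraint tree. The tuple representation and the function names `negate` and `violate` are hypothetical, and treating `VIOLATE` of a non-`and` constraint as a plain `not` is an assumption based on the description above.

```python
def negate(c):
    """NOT: push negation through AND using De Morgan's law."""
    if c[0] == "and":
        return ("or", *[negate(child) for child in c[1:]])
    return ("not", c)


def violate(c):
    """VIOLATE: violate one child of an AND at a time, keeping the rest intact."""
    if c[0] == "and":
        children = c[1:]
        return ("or", *[
            ("and", *[violate(x) if i == j else x for j, x in enumerate(children)])
            for i in range(len(children))
        ])
    # Assumption: outside of AND, VIOLATE behaves like NOT.
    return ("not", c)


X, Y, Z = ("atom", "X"), ("atom", "Y"), ("atom", "Z")
not_result = negate(("and", X, Y, Z))       # OR(¬X, ¬Y, ¬Z)
violate_result = violate(("and", X, Y, Z))  # three ANDs, one child violated in each
```

With `not`, a row that breaks all three inner constraints at once satisfies the result; with `violate`, every branch keeps two of the inner constraints intact and breaks exactly one.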
- The process would be expected to return vast quantities of data, as the single constraint `foo inSet [a, b, c]`, when violated, returns all data except `[a, b, c]` from the universal set. Whilst logically correct, this could result in an unusable tool or data set because of the time taken to create it or its eventual size.
- The process of violating constraints also violates the type for fields, e.g. `foo ofType string` will be negated to `not(foo ofType string)`. This itself could be useful for the user to test, but could also render the data unusable (e.g. if the consumer requires the 'schema' to be adhered to).
- The process of violating constraints also violates the nullability for fields, e.g. `foo not(is null)` will be negated to `foo is null`. This itself could be useful for the user to test, but could render the data unusable (e.g. if the consumer requires non-null values for field `foo`).
- Implied/default rules are not negated. As every field is implied/defaulted to allow nulls, the method of violation currently doesn't prevent null from being emitted when violating. This means that nulls can appear in both normal data generation mode and violating data generation mode.