diff --git a/docs/userguide/constraints.md b/docs/userguide/constraints.md index 8102bc3a2..9123680a2 100644 --- a/docs/userguide/constraints.md +++ b/docs/userguide/constraints.md @@ -27,7 +27,7 @@ Not all surrogate models are able to treat continuous constraints. In such situa the constraints are currently silently ignored. ``` -### ```ContinuousLinearEqualityConstraint``` +### ``ContinuousLinearEqualityConstraint`` This linear constraint asserts that the following equation is true (up to numerical rounding errors): @@ -38,8 +38,8 @@ $$ where $x_i$ is the value of the $i$'th parameter affected by the constraint and $c_i$ is the coefficient for that parameter. $\text{rhs}$ is a user-chosen number. -As an example we assume we have three parameters named ```x_1```, ```x_2``` and -```x_3```, which describe the relative concentrations in a mixture campaign. +As an example we assume we have three parameters named ``x_1``, ``x_2`` and +``x_3``, which describe the relative concentrations in a mixture campaign. The constraint assuring that they always sum up to 1.0 would look like this: ```python from baybe.constraints import ContinuousLinearEqualityConstraint @@ -51,7 +51,7 @@ ContinuousLinearEqualityConstraint( ) ``` -### ```ContinuousLinearInequalityConstraint``` +### ``ContinuousLinearInequalityConstraint`` This linear constraint asserts that the following equation is true (up to numerical rounding errors): @@ -63,7 +63,7 @@ where $x_i$ is the value of the $i$'th parameter affected by the constraint, $c_i$ is the coefficient for that parameter. $\text{rhs}$ is a user-chosen number. ```{info} -You can specify a constraint involving ```<=``` instead of ```>=``` by multiplying +You can specify a constraint involving ``<=`` instead of ``>=`` by multiplying both sides, i.e. the coefficients and rhs, by -1. ``` @@ -90,9 +90,9 @@ always describes the relation of a single parameter to its possible values. It is through chaining several conditions in constraints that we can build complex logical expressions for them. -### ```ThresholdCondition``` +### ``ThresholdCondition`` For numerical parameters, we might want to select a certain range, which can be -achieved with ```ThresholdCondition```: +achieved with ``ThresholdCondition``: ```python from baybe.constraints import ThresholdCondition @@ -103,9 +103,9 @@ ThresholdCondition( # will select all values above 150 ) ``` -### ```SubSelectionCondition``` +### ``SubSelectionCondition`` In case a specific subset of values needs to be selected, it can be done with the -```SubSelectionCondition```: +``SubSelectionCondition``: ```python from baybe.constraints import SubSelectionCondition @@ -117,15 +117,15 @@ SubSelectionCondition( # will select two solvents identified by their labels ## Discrete Constraints Discrete constraints currently do not affect the optimization process directly. Instead, they act as a filter on the search space. -For instance, a search space created via ```from_product``` might include invalid +For instance, a search space created via ``from_product`` might include invalid combinations, which can be removed again by constraints. Discrete constraints have in common that they operate on one or more parameters, -identified by the ```parameters``` member, which expects a list of parameter names as +identified by the ``parameters`` member, which expects a list of parameter names as strings. All of these parameters must be present in the campaign specification. -### ```DiscreteExcludeConstraint``` +### ``DiscreteExcludeConstraint`` This constraint simply removes a set of search space entries, according to its specifications. @@ -145,10 +145,10 @@ DiscreteExcludeConstraint( ) ``` -### ```DiscreteSumConstraint``` and ```DiscreteProductConstraint``` +### ``DiscreteSumConstraint`` and ``DiscreteProductConstraint`` These constraints constrain sums or products of numerical parameters. In the example -from [```ContinuousLinearEqualityConstraint```](#continuouslinearequalityconstraint) we -had three continuous parameters ```x_1```, ```x_2``` and ```x_3``` which needed to sum +from [``ContinuousLinearEqualityConstraint``](#continuouslinearequalityconstraint) we +had three continuous parameters ``x_1``, ``x_2`` and ``x_3`` which needed to sum up to 1.0. If these parameters were instead discrete, the corresponding constraint would look like: ```python @@ -163,15 +163,15 @@ DiscreteSumConstraint( ) ``` -### ```DiscreteNoLabelDuplicatesConstraint``` +### ``DiscreteNoLabelDuplicatesConstraint`` Sometimes duplicated labels in several parameters are undesirable. Consider an example where we have two solvents which describe different mixture components. These might have the exact same or overlapping sets of possible values, e.g. -```["Water", "THF", "Octanol"]```. +``["Water", "THF", "Octanol"]``. It would not necessarily be reasonable to allow values in which both solvents show the same label/component. -We can exclude such occurrences with the ```DiscreteNoLabelDuplicatesConstraint```: +We can exclude such occurrences with the ``DiscreteNoLabelDuplicatesConstraint``: ```python from baybe.constraints import DiscreteNoLabelDuplicatesConstraint @@ -189,9 +189,9 @@ Without this constraint, combinations like below would be possible: | 2 | THF | Water | | | 3 | Octanol | Octanol | would be excluded | -### ```DiscreteLinkedParametersConstraint``` -The ```DiscreteLinkedParametersConstraint``` in a sense is the opposite of the -```DiscreteNoLabelDuplicatesConstraint```. +### ``DiscreteLinkedParametersConstraint`` +The ``DiscreteLinkedParametersConstraint`` in a sense is the opposite of the +``DiscreteNoLabelDuplicatesConstraint``. It will ensure that **only** entries with duplicated labels are present. This can be useful for instance in a situation where we have one parameter, but would like to include it with several encodings: @@ -221,13 +221,13 @@ DiscreteLinkedParametersConstraint( | 2 | THF | Water | would be excluded | | 3 | Octanol | Octanol | | -### ```DiscreteDependenciesConstraint``` +### ``DiscreteDependenciesConstraint`` Content coming soon... -### ```DiscretePermutationInvarianceConstraint``` +### ``DiscretePermutationInvarianceConstraint`` Content coming soon... -### ```DiscreteCustomConstraint``` +### ``DiscreteCustomConstraint`` With this constraint you can specify a completely custom filter: ```python @@ -253,7 +253,7 @@ DiscreteCustomConstraint( ```{warning} Due to the arbitrary nature of code and dependencies that can be used in the -```DiscreteCustomConstraint```, de-/serializability cannot be guaranteed. As a result, -using a ```DiscreteCustomConstraint``` results in an error if you attempt to serialize -the corresponding ```Campaign```. +``DiscreteCustomConstraint``, de-/serializability cannot be guaranteed. As a result, +using a ``DiscreteCustomConstraint`` results in an error if you attempt to serialize +the corresponding ``Campaign``. ``` diff --git a/docs/userguide/parameters.md b/docs/userguide/parameters.md index cc8b1368d..a1ed0ec00 100644 --- a/docs/userguide/parameters.md +++ b/docs/userguide/parameters.md @@ -1,12 +1,12 @@ # Parameters -Parameters are fundamental for BayBE, as they configure the ```SearchSpace``` and serve +Parameters are fundamental for BayBE, as they configure the ``SearchSpace`` and serve as the direct link to the controllable variables in your experiment. Before starting an iterative campaign, the user is required to specify the exact parameters they can control and want to consider in their optimization. ```{note} -BayBE identifies each parameter by a ```name```. All parameter names in one +BayBE identifies each parameter by a ``name``. All parameter names in one campaign must be unique. ``` @@ -15,10 +15,10 @@ two parameter types: Discrete and continuous parameters. ## Continuous Parameters -### ```NumericalContinuousParameter``` +### ``NumericalContinuousParameter`` This is currently the only continuous parameter BayBE supports. This parameters type defines possible values from a numerical interval called -```bounds```, and thus has an infinite amount of possibilities. +``bounds``, and thus has an infinite amount of possibilities. Unless restrained by constraints, BayBE will consider any possible parameter value that lies within the chosen interval. @@ -37,7 +37,7 @@ These values can be numeric or label-like and are transformed internally before ingested by the surrogate model. ```{note} -We call the process of transforming labels into numbers ```encoding```. +We call the process of transforming labels into numbers ``encoding``. To make labels usable in machine learning, we assign each label one or more numbers. While there are trivial ways of doing this, BayBE also provides methods to avoid problematic biases and even introduce useful information into the resulting latent @@ -45,12 +45,12 @@ number space. For different parameters different types of encoding make sense. T situations are reflected by the different discrete parameter types BayBE offers. ``` -### ```NumericalDiscreteParameter``` +### ``NumericalDiscreteParameter`` This is the right type for parameters that have numerical values. -We support sets with equidistant values like ```(1, 2, 3, 4, 5)``` but also unevenly -spaced sets of numbers like ```(0.2, 1.0, 2.0, 5.0, 10.0, 50.0)```. +We support sets with equidistant values like ``(1, 2, 3, 4, 5)`` but also unevenly +spaced sets of numbers like ``(0.2, 1.0, 2.0, 5.0, 10.0, 50.0)``. -This parameter also supports specifying a ```tolerance```. If specified, BayBE might +This parameter also supports specifying a ``tolerance``. If specified, BayBE might throw an error if measurements are added that are not within that specified tolerance from any of the possible values. @@ -63,16 +63,16 @@ NumericalDiscreteParameter( ) ``` -### ```CategoricalParameter``` -A ```CategoricalParameter``` supports sets of strings as labels. +### ``CategoricalParameter`` +A ``CategoricalParameter`` supports sets of strings as labels. This is most suitable if the experimental choices cannot easily be translated into a number. -Examples for this could be vendors like ```("Vendor A", "Vendor B", "Vendor C")``` or -post codes like ```("PO16 7GZ", "GU16 7HF", "L1 8JQ")```. +Examples for this could be vendors like ``("Vendor A", "Vendor B", "Vendor C")`` or +post codes like ``("PO16 7GZ", "GU16 7HF", "L1 8JQ")``. Categorical parameters in BayBE can be encoded via integer or one-hot encoding. For some cases this makes sense, e.g. if we had a parameter for a setting with values -```("low", "medium", "high")```, an integer-encoding into values ```(1, 2, 3)``` would +``("low", "medium", "high")``, an integer-encoding into values ``(1, 2, 3)`` would be reasonable. ```python @@ -88,21 +88,21 @@ CategoricalParameter( However, in some cases this kind of encoding introduces an unreasonable bias into the surrogate model. Take for instance a parameter for a choice of solvents with values -```("Solvent A", "Solvent B", "Solvent C")```. Encoding these with ```(1, 2, 3)``` as +``("Solvent A", "Solvent B", "Solvent C")``. Encoding these with ``(1, 2, 3)`` as above would imply that "Solvent A" is more similar to "Solvent B" than to "Solvent C" because the number 1 is closer to 2 than to 3. This implied ordering is however not generally the case for the provided labels. In general, it is not even possible to describe the similarity between labels by ordering across one single dimension. -For this reason we also provide the ```SubstanceParameter``` which encodes labels +For this reason we also provide the ``SubstanceParameter`` which encodes labels corresponding to small molecules with chemical descriptors, capturing their similarities much better and without the need for the user to think about ordering and similarity at all. -This concept is generalized in the ```CustomDiscreteParameter``` where the user can +This concept is generalized in the ``CustomDiscreteParameter`` where the user can provide their own custom set of descriptors for each label. -### ```SubstanceParameter``` -Instead of ```values```, this parameter accepts ```data``` in form of a dictionary. The +### ``SubstanceParameter`` +Instead of ``values``, this parameter accepts ``data`` in form of a dictionary. The items correspond to pairs of labels and [SMILES](https://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system). SMILES are a string based representation of molecular structures. Based on these, BayBE can assign each label a set of molecular descriptors as encoding. @@ -124,39 +124,39 @@ SubstanceParameter( ) ``` -The ```encoding``` options define what kind of descriptors are calculated: -* ```MORDRED```: 2D descriptors from the [Mordred package](https://mordred-descriptor.github.io/documentation/master/) -* ```RDKIT```: 2D descriptors from the [RDKit package](https://www.rdkit.org/) -* ```MORGAN_FP```: Morgan fingerprints calculated with RDKit (1024 bits, radius 4) +The ``encoding`` options define what kind of descriptors are calculated: +* ``MORDRED``: 2D descriptors from the [Mordred package](https://mordred-descriptor.github.io/documentation/master/) +* ``RDKIT``: 2D descriptors from the [RDKit package](https://www.rdkit.org/) +* ``MORGAN_FP``: Morgan fingerprints calculated with RDKit (1024 bits, radius 4) These calculations will typically result in 500 to 1500 numbers per molecule. To avoid detrimental effects on the surrogate model fit we reduce the number of descriptors before using them via decorrelation. -The ```decorrelate``` option in the example above specifies that only descriptors that +The ``decorrelate`` option in the example above specifies that only descriptors that have a correlation lower than 0.7 to any other descriptor will be kept. This usually reduces the number of descriptors to 10-50, depending on the specific -items in ```data```. +items in ``data``. ```{warning} -The descriptors calculated for a ```SubstanceParameter``` were developed to describe +The descriptors calculated for a ``SubstanceParameter`` were developed to describe small molecules and are not suitable for other substances. If you deal with large molecules like polymers, or arbitrary substance mixtures, we recommend to provide your -own descriptors via the ```CustomParameter```. +own descriptors via the ``CustomParameter``. ``` ```{warning} -The ```SubstanceParameter``` is only available if BayBE was installed with the -additional ```chem``` dependency. +The ``SubstanceParameter`` is only available if BayBE was installed with the +additional ``chem`` dependency. ``` -### ```CustomDiscreteParameter``` -The ```encoding``` concept introduced above is generalized by the -```CustomParameter```. +### ``CustomDiscreteParameter`` +The ``encoding`` concept introduced above is generalized by the +``CustomParameter``. Here, the user is expected to provide their own descriptors for the encoding. Take for instance a parameter that corresponds to the choice of a polymer. Polymers are not well represented by the small molecule descriptors utilized in the -```SubstanceParameter```. +``SubstanceParameter``. But one could provide experimental measurements or common metrics used to classify polymers: @@ -178,15 +178,15 @@ CustomDiscreteParameter( ) ``` -With the ```CustomParameter``` you can also encode parameter labels that have nothing to do +With the ``CustomParameter`` you can also encode parameter labels that have nothing to do with substances. For example, a parameter corresponding to the choice of a vendor is typically not easily encoded with standard means. In BayBE's framework you can provide numbers corresponding e.g. to delivery time, reliability or average price of the vendor to encode the labels with these via the -```CustomParameter```. +``CustomParameter``. -### ```TaskParameter``` +### ``TaskParameter`` Often, several experimental campaigns involve similar or even identical parameters but still have one or more differences. For example, when optimizing reagents in a chemical reaction, the reactants remain