Merge pull request #1086 from mrapp-ke/new-sampling-options
Add options "min_samples" and "max_samples"
michael-rapp authored Oct 2, 2024
2 parents 7451324 + 38d1398 commit 07d5329
Showing 31 changed files with 988 additions and 212 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,14 @@ tocdepth: 2

A major update to the BOOMER algorithm that comes with the following changes.

```{warning}
This release comes with several API changes. For an updated overview of the available parameters and command line arguments, please refer to the [documentation](https://mlrl-boomer.readthedocs.io/en/0.11.0/).
```

### API Changes

- The options `min_samples` and `max_samples` have been added to the values of the command line arguments `--feature-sampling` and `--instance-sampling`.
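
As a rough illustration of how such bounds could interact with a fractional sample size, the following sketch clamps the computed number of samples. The helper `computeNumSamples` and its rounding behaviour are assumptions made for illustration and are not taken from the BOOMER sources. Only the convention that a maximum of 0 means "unrestricted" follows the documentation comments in this commit.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of the library's API: derives the number of
// elements to sample from a fraction and the optional lower and upper bounds.
inline std::uint32_t computeNumSamples(std::uint32_t numTotal, float sampleSize, std::uint32_t minSamples,
                                       std::uint32_t maxSamples) {
    // Fraction of the available elements, rounded to the nearest integer (an assumption).
    std::uint32_t numSamples = static_cast<std::uint32_t>(std::lround(sampleSize * numTotal));

    // A maximum of 0 means that the number of samples is not restricted from above.
    if (maxSamples > 0) {
        numSamples = std::min(numSamples, maxSamples);
    }

    // Enforce the lower bound, but never request more elements than are available.
    numSamples = std::max(numSamples, minSamples);
    return std::min(numSamples, numTotal);
}
```

Under these assumptions, `computeNumSamples(100, 0.5f, 1, 20)` is capped at 20, whereas `computeNumSamples(100, 0.001f, 5, 0)` is raised to 5.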

### Quality-of-Life Improvements

- C++ 20 is now required for compiling the project.
@@ -34,6 +34,39 @@ class MLRLCOMMON_API IFeatureSamplingWithoutReplacementConfig {
*/
virtual IFeatureSamplingWithoutReplacementConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of features that are included in a sample.
*
* @return The minimum number of features that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of features that should be included in a sample.
*
* @param minSamples The minimum number of features that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IFeatureSamplingWithoutReplacementConfig` that allows
* further configuration of the method for sampling features
*/
virtual IFeatureSamplingWithoutReplacementConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of features that are included in a sample.
*
* @return The maximum number of features that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of features that should be included in a sample.
*
* @param maxSamples The maximum number of features that should be included in a sample. Must be at least the value
* returned by `getMinSamples` or 0, if the number of features should not be restricted
* @return A reference to an object of type `IFeatureSamplingWithoutReplacementConfig` that allows
* further configuration of the method for sampling features
*/
virtual IFeatureSamplingWithoutReplacementConfig& setMaxSamples(uint32 maxSamples) = 0;

/**
* Returns the number of trailing features that are always included in a sample.
*
@@ -63,6 +96,10 @@ class FeatureSamplingWithoutReplacementConfig final : public IFeatureSamplingCon

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

uint32 numRetained_;

public:
@@ -77,6 +114,14 @@ class FeatureSamplingWithoutReplacementConfig final : public IFeatureSamplingCon

IFeatureSamplingWithoutReplacementConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IFeatureSamplingWithoutReplacementConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IFeatureSamplingWithoutReplacementConfig& setMaxSamples(uint32 maxSamples) override;

uint32 getNumRetained() const override;

IFeatureSamplingWithoutReplacementConfig& setNumRetained(uint32 numRetained) override;
@@ -34,6 +34,39 @@ class MLRLCOMMON_API IExampleWiseStratifiedInstanceSamplingConfig {
* allows further configuration of the method for sampling instances
*/
virtual IExampleWiseStratifiedInstanceSamplingConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of examples that are included in a sample.
*
* @return The minimum number of examples that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of examples that should be included in a sample.
*
* @param minSamples The minimum number of examples that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IExampleWiseStratifiedInstanceSamplingConfig` that
* allows further configuration of the method for sampling instances
*/
virtual IExampleWiseStratifiedInstanceSamplingConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of examples that are included in a sample.
*
* @return The maximum number of examples that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of examples that should be included in a sample.
*
* @param maxSamples The maximum number of examples that should be included in a sample. Must be at least the value
* returned by `getMinSamples` or 0, if the number of examples should not be restricted
* @return A reference to an object of type `IExampleWiseStratifiedInstanceSamplingConfig` that
* allows further configuration of the method for sampling instances
*/
virtual IExampleWiseStratifiedInstanceSamplingConfig& setMaxSamples(uint32 maxSamples) = 0;
};

/**
@@ -48,6 +81,10 @@ class ExampleWiseStratifiedInstanceSamplingConfig final : public IClassification

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

public:

/**
@@ -60,6 +97,14 @@ class ExampleWiseStratifiedInstanceSamplingConfig final : public IClassification

IExampleWiseStratifiedInstanceSamplingConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IExampleWiseStratifiedInstanceSamplingConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IExampleWiseStratifiedInstanceSamplingConfig& setMaxSamples(uint32 maxSamples) override;

std::unique_ptr<IClassificationInstanceSamplingFactory> createClassificationInstanceSamplingFactory()
const override;
};
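
The getter and setter pairs added in this commit follow a fluent pattern in which each setter returns a reference to the configuration object. The sketch below shows how such a configuration could validate the documented constraints. `SamplingConfigSketch` is a hypothetical stand-in rather than one of the library's classes, and throwing `std::invalid_argument` is an assumption about how invalid values might be rejected.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a sampling configuration; NOT the library's implementation.
class SamplingConfigSketch final {
    private:

        std::uint32_t minSamples_ = 1;

        // 0 means that the number of samples is not restricted.
        std::uint32_t maxSamples_ = 0;

    public:

        std::uint32_t getMinSamples() const {
            return minSamples_;
        }

        SamplingConfigSketch& setMinSamples(std::uint32_t minSamples) {
            // Documented constraint: must be at least 1.
            if (minSamples < 1) {
                throw std::invalid_argument("minSamples must be at least 1");
            }

            minSamples_ = minSamples;
            return *this;
        }

        std::uint32_t getMaxSamples() const {
            return maxSamples_;
        }

        SamplingConfigSketch& setMaxSamples(std::uint32_t maxSamples) {
            // Documented constraint: must be at least the current minimum, or 0 for "unrestricted".
            if (maxSamples != 0 && maxSamples < minSamples_) {
                throw std::invalid_argument("maxSamples must be 0 or at least minSamples");
            }

            maxSamples_ = maxSamples;
            return *this;
        }
};
```

Because the upper bound is checked against the current lower bound in this sketch, calls are best chained in the order `config.setMinSamples(10).setMaxSamples(50)`.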
@@ -35,6 +35,39 @@ class MLRLCOMMON_API IOutputWiseStratifiedInstanceSamplingConfig {
* allows further configuration of the method for sampling instances
*/
virtual IOutputWiseStratifiedInstanceSamplingConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of examples that are included in a sample.
*
* @return The minimum number of examples that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of examples that should be included in a sample.
*
* @param minSamples The minimum number of examples that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IOutputWiseStratifiedInstanceSamplingConfig` that
* allows further configuration of the method for sampling instances
*/
virtual IOutputWiseStratifiedInstanceSamplingConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of examples that are included in a sample.
*
* @return The maximum number of examples that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of examples that should be included in a sample.
*
* @param maxSamples The maximum number of examples that should be included in a sample. Must be at least the value
* returned by `getMinSamples` or 0, if the number of examples should not be restricted
* @return A reference to an object of type `IOutputWiseStratifiedInstanceSamplingConfig` that
* allows further configuration of the method for sampling instances
*/
virtual IOutputWiseStratifiedInstanceSamplingConfig& setMaxSamples(uint32 maxSamples) = 0;
};

/**
@@ -49,6 +82,10 @@ class OutputWiseStratifiedInstanceSamplingConfig final : public IClassificationI

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

public:

/**
@@ -61,6 +98,14 @@ class OutputWiseStratifiedInstanceSamplingConfig final : public IClassificationI

IOutputWiseStratifiedInstanceSamplingConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IOutputWiseStratifiedInstanceSamplingConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IOutputWiseStratifiedInstanceSamplingConfig& setMaxSamples(uint32 maxSamples) override;

std::unique_ptr<IClassificationInstanceSamplingFactory> createClassificationInstanceSamplingFactory()
const override;
};
@@ -33,6 +33,39 @@ class MLRLCOMMON_API IInstanceSamplingWithReplacementConfig {
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithReplacementConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of examples that are included in a sample.
*
* @return The minimum number of examples that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of examples that should be included in a sample.
*
* @param minSamples The minimum number of examples that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IInstanceSamplingWithReplacementConfig` that allows
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithReplacementConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of examples that are included in a sample.
*
* @return The maximum number of examples that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of examples that should be included in a sample.
*
* @param maxSamples The maximum number of examples that should be included in a sample. Must be at least the value
* returned by `getMinSamples` or 0, if the number of examples should not be restricted
* @return A reference to an object of type `IInstanceSamplingWithReplacementConfig` that allows
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithReplacementConfig& setMaxSamples(uint32 maxSamples) = 0;
};

/**
@@ -47,6 +80,10 @@ class InstanceSamplingWithReplacementConfig final : public IClassificationInstan

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

public:

/**
@@ -59,6 +96,14 @@ class InstanceSamplingWithReplacementConfig final : public IClassificationInstan

IInstanceSamplingWithReplacementConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IInstanceSamplingWithReplacementConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IInstanceSamplingWithReplacementConfig& setMaxSamples(uint32 maxSamples) override;

std::unique_ptr<IClassificationInstanceSamplingFactory> createClassificationInstanceSamplingFactory()
const override;

@@ -33,6 +33,39 @@ class MLRLCOMMON_API IInstanceSamplingWithoutReplacementConfig {
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithoutReplacementConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of examples that are included in a sample.
*
* @return The minimum number of examples that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of examples that should be included in a sample.
*
* @param minSamples The minimum number of examples that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IInstanceSamplingWithoutReplacementConfig` that allows
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithoutReplacementConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of examples that are included in a sample.
*
* @return The maximum number of examples that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of examples that should be included in a sample.
*
* @param maxSamples The maximum number of examples that should be included in a sample. Must be at least the value
* returned by `getMinSamples` or 0, if the number of examples should not be restricted
* @return A reference to an object of type `IInstanceSamplingWithoutReplacementConfig` that allows
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithoutReplacementConfig& setMaxSamples(uint32 maxSamples) = 0;
};

/**
@@ -47,6 +80,10 @@ class InstanceSamplingWithoutReplacementConfig final : public IClassificationIns

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

public:

/**
@@ -59,6 +96,14 @@ class InstanceSamplingWithoutReplacementConfig final : public IClassificationIns

IInstanceSamplingWithoutReplacementConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IInstanceSamplingWithoutReplacementConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IInstanceSamplingWithoutReplacementConfig& setMaxSamples(uint32 maxSamples) override;

std::unique_ptr<IClassificationInstanceSamplingFactory> createClassificationInstanceSamplingFactory()
const override;

@@ -46,8 +46,12 @@ class ExampleWiseStratification final {
*
* @param weightVector A reference to an object of type `BitWeightVector`, the weights should be written to
* @param sampleSize The fraction of the available examples to be selected
* @param minSamples The minimum number of examples to be included in the sample. Must be at least 1
* @param maxSamples The maximum number of examples to be included in the sample. Must be at least
* `minSamples` or 0, if the number of examples should not be restricted
*/
void sampleWeights(BitWeightVector& weightVector, float32 sampleSize) const;
void sampleWeights(BitWeightVector& weightVector, float32 sampleSize, uint32 minSamples,
uint32 maxSamples) const;

/**
* Randomly splits the available examples into two distinct sets and updates a given `BiPartition` accordingly.
@@ -49,8 +49,11 @@ class LabelWiseStratification final {
*
* @param weightVector A reference to an object of type `BitWeightVector`, the weights should be written to
* @param sampleSize The fraction of the available examples to be selected
* @param minSamples The minimum number of examples to be included in the sample. Must be at least 1
* @param maxSamples The maximum number of examples to be included in the sample. Must be at least
* `minSamples` or 0, if the number of examples should not be restricted
*/
void sampleWeights(BitWeightVector& weightVector, float32 sampleSize);
void sampleWeights(BitWeightVector& weightVector, float32 sampleSize, uint32 minSamples, uint32 maxSamples);
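
To make the role of the new `minSamples` and `maxSamples` parameters more concrete, the sketch below draws a bounded sample without replacement and writes binary weights. It is a deliberately simplified, non-stratified stand-in: the function `sampleWeightsSketch`, the use of `std::vector<bool>` in place of `BitWeightVector`, and the shuffle-based selection are all assumptions for illustration and do not reflect the library's stratification logic.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical, non-stratified sketch: marks a bounded number of examples as selected.
inline std::vector<bool> sampleWeightsSketch(std::uint32_t numExamples, float sampleSize, std::uint32_t minSamples,
                                             std::uint32_t maxSamples, std::mt19937& rng) {
    // Number of examples implied by the fraction, rounded to the nearest integer (an assumption).
    std::uint32_t numSamples = static_cast<std::uint32_t>(std::lround(sampleSize * numExamples));

    // Apply the upper bound, where 0 means "unrestricted", then the lower bound.
    if (maxSamples > 0) {
        numSamples = std::min(numSamples, maxSamples);
    }

    numSamples = std::min(std::max(numSamples, minSamples), numExamples);

    // Shuffle all indices and select the first `numSamples` of them.
    std::vector<std::uint32_t> indices(numExamples);
    std::iota(indices.begin(), indices.end(), 0u);
    std::shuffle(indices.begin(), indices.end(), rng);

    // A weight of `true` indicates that the corresponding example is part of the sample.
    std::vector<bool> weights(numExamples, false);

    for (std::uint32_t i = 0; i < numSamples; i++) {
        weights[indices[i]] = true;
    }

    return weights;
}
```

Which examples are selected depends on the random number generator, but under these assumptions the number of selected examples is determined by the bounds alone.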

/**
* Randomly splits the available examples into two distinct sets and updates a given `BiPartition` accordingly.