Add options "min_samples" and "max_samples" #1086

Merged
merged 12 commits into from
Oct 2, 2024
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,14 @@ tocdepth: 2

A major update to the BOOMER algorithm that comes with the following changes.

```{warning}
This release comes with several API changes. For an updated overview of the available parameters and command line arguments, please refer to the [documentation](https://mlrl-boomer.readthedocs.io/en/0.11.0/).
```

### API Changes

- The options `min_samples` and `max_samples` have been added to the values of the command line arguments `--feature-sampling` and `--instance-sampling`.
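
One plausible reading of how these options interact with the existing sample size fraction (`sampleSize` in the C++ API) is sketched below. The helper `calculateNumSamples` is hypothetical and not taken from this pull request. It assumes that `min_samples` acts as a lower bound, that `max_samples` acts as an upper bound with 0 meaning "unrestricted", and that a sample never contains more items than are available.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not part of the library: turns a fraction into an absolute sample count.
// Assumes `minSamples >= 1` and that `maxSamples` is either 0 (unrestricted) or >= `minSamples`.
inline std::uint32_t calculateNumSamples(std::uint32_t numTotal, float sampleSize, std::uint32_t minSamples,
                                         std::uint32_t maxSamples) {
    // Count implied by the fraction alone (truncated towards zero)
    std::uint32_t numSamples = static_cast<std::uint32_t>(sampleSize * static_cast<float>(numTotal));

    // Enforce the lower bound
    numSamples = std::max(numSamples, minSamples);

    // Enforce the upper bound, unless it is disabled via 0
    if (maxSamples > 0) {
        numSamples = std::min(numSamples, maxSamples);
    }

    // Assumption: never request more items than are available
    return std::min(numSamples, numTotal);
}
```

Under these assumptions, 1000 examples with a fraction of 0.5 and `max_samples` set to 100 would give a sample of 100 examples, while 10 examples with a fraction of 0.01 and `min_samples` set to 5 would give 5.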

### Quality-of-Life Improvements

- C++ 20 is now required for compiling the project.
@@ -34,6 +34,39 @@ class MLRLCOMMON_API IFeatureSamplingWithoutReplacementConfig {
*/
virtual IFeatureSamplingWithoutReplacementConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of features that are included in a sample.
*
* @return The minimum number of features that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of features that should be included in a sample.
*
* @param minSamples The minimum number of features that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IFeatureSamplingWithoutReplacementConfig` that allows
* further configuration of the method for sampling features
*/
virtual IFeatureSamplingWithoutReplacementConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of features that are included in a sample.
*
* @return The maximum number of features that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of features that should be included in a sample.
*
* @param maxSamples The maximum number of features that should be included in a sample. Must be at least the
* value returned by `getMinSamples` or 0, if the number of features should not be restricted
* @return A reference to an object of type `IFeatureSamplingWithoutReplacementConfig` that allows
* further configuration of the method for sampling features
*/
virtual IFeatureSamplingWithoutReplacementConfig& setMaxSamples(uint32 maxSamples) = 0;

/**
* Returns the number of trailing features that are always included in a sample.
*
@@ -63,6 +96,10 @@ class FeatureSamplingWithoutReplacementConfig final : public IFeatureSamplingCon

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

uint32 numRetained_;

public:
@@ -77,6 +114,14 @@ class FeatureSamplingWithoutReplacementConfig final : public IFeatureSamplingCon

IFeatureSamplingWithoutReplacementConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IFeatureSamplingWithoutReplacementConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IFeatureSamplingWithoutReplacementConfig& setMaxSamples(uint32 maxSamples) override;

uint32 getNumRetained() const override;

IFeatureSamplingWithoutReplacementConfig& setNumRetained(uint32 numRetained) override;
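
The setters above return a reference to the configuration object, so calls can be chained. The following is a minimal sketch of that pattern together with the constraints stated in the doc comments (`minSamples` at least 1, `maxSamples` either 0 or at least `minSamples`). The class `SamplingConfigSketch`, its default values and the use of `std::invalid_argument` are assumptions made for illustration. The implementation file is not part of the hunks shown here, so the library's actual validation may differ.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a sampling configuration; not the library's implementation.
class SamplingConfigSketch final {
    private:

        std::uint32_t minSamples_ = 1;  // assumed default: at least one item per sample

        std::uint32_t maxSamples_ = 0;  // assumed default: 0 means "not restricted"

    public:

        std::uint32_t getMinSamples() const {
            return minSamples_;
        }

        SamplingConfigSketch& setMinSamples(std::uint32_t minSamples) {
            if (minSamples < 1) {
                throw std::invalid_argument("minSamples must be at least 1");
            }

            minSamples_ = minSamples;
            return *this;  // returning *this enables chaining
        }

        std::uint32_t getMaxSamples() const {
            return maxSamples_;
        }

        SamplingConfigSketch& setMaxSamples(std::uint32_t maxSamples) {
            // 0 disables the upper bound; any other value must not undercut the lower bound
            if (maxSamples != 0 && maxSamples < minSamples_) {
                throw std::invalid_argument("maxSamples must be 0 or at least minSamples");
            }

            maxSamples_ = maxSamples;
            return *this;
        }
};
```

With this sketch, `config.setMinSamples(10).setMaxSamples(100)` configures both bounds in one statement, and a later `config.setMaxSamples(5)` is rejected because it falls below the configured minimum.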
@@ -34,6 +34,39 @@ class MLRLCOMMON_API IExampleWiseStratifiedInstanceSamplingConfig {
* allows further configuration of the method for sampling instances
*/
virtual IExampleWiseStratifiedInstanceSamplingConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of examples that are included in a sample.
*
* @return The minimum number of examples that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of examples that should be included in a sample.
*
* @param minSamples The minimum number of examples that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IExampleWiseStratifiedInstanceSamplingConfig` that
* allows further configuration of the method for sampling instances
*/
virtual IExampleWiseStratifiedInstanceSamplingConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of examples that are included in a sample.
*
* @return The maximum number of examples that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of examples that should be included in a sample.
*
* @param maxSamples The maximum number of examples that should be included in a sample. Must be at least the
* value returned by `getMinSamples` or 0, if the number of examples should not be restricted
* @return A reference to an object of type `IExampleWiseStratifiedInstanceSamplingConfig` that
* allows further configuration of the method for sampling instances
*/
virtual IExampleWiseStratifiedInstanceSamplingConfig& setMaxSamples(uint32 maxSamples) = 0;
};

/**
@@ -48,6 +81,10 @@ class ExampleWiseStratifiedInstanceSamplingConfig final : public IClassification

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

public:

/**
@@ -60,6 +97,14 @@ class ExampleWiseStratifiedInstanceSamplingConfig final : public IClassification

IExampleWiseStratifiedInstanceSamplingConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IExampleWiseStratifiedInstanceSamplingConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IExampleWiseStratifiedInstanceSamplingConfig& setMaxSamples(uint32 maxSamples) override;

std::unique_ptr<IClassificationInstanceSamplingFactory> createClassificationInstanceSamplingFactory()
const override;
};
@@ -35,6 +35,39 @@ class MLRLCOMMON_API IOutputWiseStratifiedInstanceSamplingConfig {
* allows further configuration of the method for sampling instances
*/
virtual IOutputWiseStratifiedInstanceSamplingConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of examples that are included in a sample.
*
* @return The minimum number of examples that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of examples that should be included in a sample.
*
* @param minSamples The minimum number of examples that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IOutputWiseStratifiedInstanceSamplingConfig` that
* allows further configuration of the method for sampling instances
*/
virtual IOutputWiseStratifiedInstanceSamplingConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of examples that are included in a sample.
*
* @return The maximum number of examples that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of examples that should be included in a sample.
*
* @param maxSamples The maximum number of examples that should be included in a sample. Must be at least the
* value returned by `getMinSamples` or 0, if the number of examples should not be restricted
* @return A reference to an object of type `IOutputWiseStratifiedInstanceSamplingConfig` that
* allows further configuration of the method for sampling instances
*/
virtual IOutputWiseStratifiedInstanceSamplingConfig& setMaxSamples(uint32 maxSamples) = 0;
};

/**
@@ -49,6 +82,10 @@ class OutputWiseStratifiedInstanceSamplingConfig final : public IClassificationI

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

public:

/**
@@ -61,6 +98,14 @@ class OutputWiseStratifiedInstanceSamplingConfig final : public IClassificationI

IOutputWiseStratifiedInstanceSamplingConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IOutputWiseStratifiedInstanceSamplingConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IOutputWiseStratifiedInstanceSamplingConfig& setMaxSamples(uint32 maxSamples) override;

std::unique_ptr<IClassificationInstanceSamplingFactory> createClassificationInstanceSamplingFactory()
const override;
};
@@ -33,6 +33,39 @@ class MLRLCOMMON_API IInstanceSamplingWithReplacementConfig {
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithReplacementConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of examples that are included in a sample.
*
* @return The minimum number of examples that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of examples that should be included in a sample.
*
* @param minSamples The minimum number of examples that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IInstanceSamplingWithReplacementConfig` that allows
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithReplacementConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of examples that are included in a sample.
*
* @return The maximum number of examples that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of examples that should be included in a sample.
*
* @param maxSamples The maximum number of examples that should be included in a sample. Must be at least the
* value returned by `getMinSamples` or 0, if the number of examples should not be restricted
* @return A reference to an object of type `IInstanceSamplingWithReplacementConfig` that allows
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithReplacementConfig& setMaxSamples(uint32 maxSamples) = 0;
};

/**
@@ -47,6 +80,10 @@ class InstanceSamplingWithReplacementConfig final : public IClassificationInstan

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

public:

/**
@@ -59,6 +96,14 @@ class InstanceSamplingWithReplacementConfig final : public IClassificationInstan

IInstanceSamplingWithReplacementConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IInstanceSamplingWithReplacementConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IInstanceSamplingWithReplacementConfig& setMaxSamples(uint32 maxSamples) override;

std::unique_ptr<IClassificationInstanceSamplingFactory> createClassificationInstanceSamplingFactory()
const override;
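
For orientation, sampling with replacement can be pictured as drawing a fixed number of example indices uniformly at random, where the same example may be drawn more than once and ends up with a correspondingly higher weight. The function `sampleWithReplacement` below is a hypothetical sketch of that general technique, not the code touched by this pull request. The number of draws `numSamples` is assumed to have already been derived from the sample size and the `minSamples`/`maxSamples` bounds.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical sketch: draws `numSamples` examples with replacement and returns one weight per example.
inline std::vector<std::uint32_t> sampleWithReplacement(std::uint32_t numTotal, std::uint32_t numSamples,
                                                        std::mt19937& rng) {
    std::vector<std::uint32_t> weights(numTotal, 0);

    // Nothing can be drawn from an empty set of examples
    if (numTotal == 0) {
        return weights;
    }

    std::uniform_int_distribution<std::uint32_t> distribution(0, numTotal - 1);

    // Each draw increments the weight of the chosen example, so the weights sum up to `numSamples`
    for (std::uint32_t i = 0; i < numSamples; i++) {
        weights[distribution(rng)]++;
    }

    return weights;
}
```

Since examples may repeat, the number of distinct examples with a non-zero weight is usually smaller than `numSamples`.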

@@ -33,6 +33,39 @@ class MLRLCOMMON_API IInstanceSamplingWithoutReplacementConfig {
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithoutReplacementConfig& setSampleSize(float32 sampleSize) = 0;

/**
* Returns the minimum number of examples that are included in a sample.
*
* @return The minimum number of examples that are included in a sample
*/
virtual uint32 getMinSamples() const = 0;

/**
* Sets the minimum number of examples that should be included in a sample.
*
* @param minSamples The minimum number of examples that should be included in a sample. Must be at least 1
* @return A reference to an object of type `IInstanceSamplingWithoutReplacementConfig` that allows
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithoutReplacementConfig& setMinSamples(uint32 minSamples) = 0;

/**
* Returns the maximum number of examples that are included in a sample.
*
* @return The maximum number of examples that are included in a sample
*/
virtual uint32 getMaxSamples() const = 0;

/**
* Sets the maximum number of examples that should be included in a sample.
*
* @param maxSamples The maximum number of examples that should be included in a sample. Must be at least the
* value returned by `getMinSamples` or 0, if the number of examples should not be restricted
* @return A reference to an object of type `IInstanceSamplingWithoutReplacementConfig` that allows
* further configuration of the method for sampling instances
*/
virtual IInstanceSamplingWithoutReplacementConfig& setMaxSamples(uint32 maxSamples) = 0;
};

/**
@@ -47,6 +80,10 @@ class InstanceSamplingWithoutReplacementConfig final : public IClassificationIns

float32 sampleSize_;

uint32 minSamples_;

uint32 maxSamples_;

public:

/**
@@ -59,6 +96,14 @@ class InstanceSamplingWithoutReplacementConfig final : public IClassificationIns

IInstanceSamplingWithoutReplacementConfig& setSampleSize(float32 sampleSize) override;

uint32 getMinSamples() const override;

IInstanceSamplingWithoutReplacementConfig& setMinSamples(uint32 minSamples) override;

uint32 getMaxSamples() const override;

IInstanceSamplingWithoutReplacementConfig& setMaxSamples(uint32 maxSamples) override;

std::unique_ptr<IClassificationInstanceSamplingFactory> createClassificationInstanceSamplingFactory()
const override;
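
In contrast, sampling without replacement selects each example at most once, so the result can be stored as one bit per example. The function `sampleWithoutReplacement` below is a hypothetical sketch based on a partial Fisher-Yates shuffle. It is not the algorithm used by the library, and `std::vector<bool>` merely stands in for a bit vector such as `BitWeightVector`. As before, `numSamples` is assumed to have been bounded by `minSamples` and `maxSamples` beforehand.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical sketch: marks `numSamples` distinct examples out of `numTotal` as selected.
inline std::vector<bool> sampleWithoutReplacement(std::uint32_t numTotal, std::uint32_t numSamples,
                                                  std::mt19937& rng) {
    // At most `numTotal` distinct examples can be selected
    numSamples = std::min(numSamples, numTotal);

    std::vector<std::uint32_t> indices(numTotal);
    std::iota(indices.begin(), indices.end(), 0u);
    std::vector<bool> selected(numTotal, false);

    // Partial Fisher-Yates shuffle: position i receives a random, not yet chosen index
    for (std::uint32_t i = 0; i < numSamples; i++) {
        std::uniform_int_distribution<std::uint32_t> distribution(i, numTotal - 1);
        std::swap(indices[i], indices[distribution(rng)]);
        selected[indices[i]] = true;
    }

    return selected;
}
```

Only the first `numSamples` positions are shuffled, which keeps the number of random draws proportional to the size of the sample rather than to the size of the data set.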

@@ -46,8 +46,12 @@ class ExampleWiseStratification final {
*
* @param weightVector A reference to an object of type `BitWeightVector`, the weights should be written to
* @param sampleSize The fraction of the available examples to be selected
* @param minSamples The minimum number of examples to be included in the sample. Must be at least 1
* @param maxSamples The maximum number of examples to be included in the sample. Must be at least
* `minSamples` or 0, if the number of examples should not be restricted
*/
void sampleWeights(BitWeightVector& weightVector, float32 sampleSize) const;
void sampleWeights(BitWeightVector& weightVector, float32 sampleSize, uint32 minSamples,
uint32 maxSamples) const;

/**
* Randomly splits the available examples into two distinct sets and updates a given `BiPartition` accordingly.
@@ -49,8 +49,11 @@ class LabelWiseStratification final {
*
* @param weightVector A reference to an object of type `BitWeightVector`, the weights should be written to
* @param sampleSize The fraction of the available examples to be selected
* @param minSamples The minimum number of examples to be included in the sample. Must be at least 1
* @param maxSamples The maximum number of examples to be included in the sample. Must be at least
* `minSamples` or 0, if the number of examples should not be restricted
*/
void sampleWeights(BitWeightVector& weightVector, float32 sampleSize);
void sampleWeights(BitWeightVector& weightVector, float32 sampleSize, uint32 minSamples, uint32 maxSamples);

/**
* Randomly splits the available examples into two distinct sets and updates a given `BiPartition` accordingly.