[Feature] Outcome aware train-test split #396

egillax · 2023-06-15T09:24:29Z

According to this paper, discriminative model performance stops improving once the number of outcome events passes a certain threshold, at least for L1-regularized logistic regression.

When the data set is very large, we could take advantage of this by limiting the training set to that number of outcome events (or slightly above, to be safe) and moving the rest of the data to the test set. `splitData` takes the population as an input, so it should be relatively easy to adjust the splits based on the number of outcome events.
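
A rough sketch of the idea, written in Python for illustration rather than in the package's own code, and not based on how `splitData` is actually implemented. The function name `outcome_aware_split`, the parameters `target_events` and `max_train_fraction`, and the choice to sample non-events at the same rate as events (so the outcome rate in the training set stays roughly the same) are all assumptions.

```python
import random


def outcome_aware_split(outcomes, target_events, max_train_fraction=0.75, seed=42):
    """Return (train_idx, test_idx) with the training set capped at target_events.

    outcomes: sequence of 0/1 labels, one per person in the population.
    target_events: hypothetical number of outcome events after which
        performance is assumed to plateau.
    max_train_fraction: hypothetical upper bound on the training share,
        used when the population has too few events to reach the target.
    """
    events = [i for i, y in enumerate(outcomes) if y == 1]
    non_events = [i for i, y in enumerate(outcomes) if y != 1]
    if not events:
        raise ValueError("population contains no outcome events")

    rng = random.Random(seed)
    rng.shuffle(events)
    rng.shuffle(non_events)

    # Never train on more events than the target, nor more than the usual split allows.
    n_train_events = min(target_events, int(max_train_fraction * len(events)))
    # Sample non-events in the same proportion to keep the outcome rate similar.
    n_train_non_events = round(n_train_events * len(non_events) / len(events))

    train_idx = sorted(events[:n_train_events] + non_events[:n_train_non_events])
    test_idx = sorted(events[n_train_events:] + non_events[n_train_non_events:])
    return train_idx, test_idx


# Example: 1000 people, 100 with the outcome, plateau assumed at 20 events.
outcomes = [1] * 100 + [0] * 900
train_idx, test_idx = outcome_aware_split(outcomes, target_events=20)
print(len(train_idx), len(test_idx))
```

With a small population the cap never kicks in and this falls back to an ordinary stratified split at `max_train_fraction`, so existing behaviour would be unchanged for most data sets.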
