ykszk/stratified_group_kfold

Stratified Group K-fold

Split a dataset into k folds so that each fold has a balanced label distribution (stratified) and no group appears in more than one fold (non-overlapping groups).

The StratifiedGroupKFold class is compatible with sklearn.model_selection.KFold: it exposes the same split interface, with groups passed as the third argument.

Reference: Stratified Group k-Fold Cross-Validation | Kaggle
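
To make the idea concrete, here is a minimal greedy sketch of how whole groups could be assigned to folds while keeping each label's share similar across folds. It is an illustration under stated assumptions, not necessarily how this package implements the split: the function name `greedy_stratified_group_folds`, its parameters, and the scoring rule are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
import random
from statistics import pstdev


def greedy_stratified_group_folds(y, groups, n_splits, seed=0):
    """Hypothetical helper: map each group to a fold index.

    Groups are never divided, and each group goes to the fold that keeps
    every label's share most even across folds at the time it is placed.
    """
    labels = sorted(set(y))
    total = Counter(y)

    # Count how many samples of each label every group contains.
    group_counts = defaultdict(Counter)
    for label, group in zip(y, groups):
        group_counts[group][label] += 1

    # Shuffle to vary tie order, then place the most label-imbalanced groups first.
    group_ids = sorted(group_counts)
    random.Random(seed).shuffle(group_ids)
    group_ids.sort(key=lambda g: -pstdev([group_counts[g][l] for l in labels]))

    fold_counts = [Counter() for _ in range(n_splits)]
    assignment = {}
    for g in group_ids:
        best_fold, best_score = None, None
        for fold in range(n_splits):
            # Score: spread of each label's per-fold share if g joined `fold`.
            score = 0.0
            for l in labels:
                shares = [
                    (fold_counts[f][l] + (group_counts[g][l] if f == fold else 0))
                    / total[l]
                    for f in range(n_splits)
                ]
                score += pstdev(shares)
            if best_score is None or score < best_score:
                best_fold, best_score = fold, score
        fold_counts[best_fold].update(group_counts[g])
        assignment[g] = best_fold
    return assignment


# Toy data: six groups of two samples, one per label.
y = [0, 1] * 6
groups = [i // 2 for i in range(12)]
print(greedy_stratified_group_folds(y, groups, n_splits=3))
```

Because whole groups are assigned, the label balance is only approximate when groups differ in size or label mix.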

Install

pip install git+https://github.com/yk-szk/stratified_group_kfold

Usage

from stratified_group_kfold import StratifiedGroupKFold

# load_dataset() and do_stuff() are placeholders for your own code.
X, y, groups = load_dataset()

sgkf = StratifiedGroupKFold(n_splits=5, shuffle=True)
for train_index, test_index in sgkf.split(X, y, groups):
    do_stuff(train_index, test_index)
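
A quick way to sanity-check a split is to confirm that no group lands in both the train and test side of any fold, and to look at the label counts of each test fold. The helper below is a small sketch for that purpose. `summarize_folds` is a hypothetical name and not part of this package; it accepts any iterable of (train_index, test_index) pairs, such as the one a KFold-style `split` method yields.

```python
from collections import Counter


def summarize_folds(splits, y, groups):
    """Hypothetical helper: verify group separation and count test labels.

    Returns one Counter of test-set labels per fold and raises ValueError
    if any group appears in both the train and test indices of a fold.
    """
    summary = []
    for train_index, test_index in splits:
        train_groups = {groups[i] for i in train_index}
        test_groups = {groups[i] for i in test_index}
        shared = train_groups & test_groups
        if shared:
            raise ValueError(f"groups in both train and test: {sorted(shared)}")
        summary.append(Counter(y[i] for i in test_index))
    return summary


# Hand-written splits on toy data; in practice, pass the splitter's output.
y = [0, 1, 0, 1]
groups = ["a", "a", "b", "b"]
splits = [([0, 1], [2, 3]), ([2, 3], [0, 1])]
for fold, counts in enumerate(summarize_folds(splits, y, groups)):
    print(fold, dict(counts))
```

With the splitter from the usage example, the call would look like `summarize_folds(sgkf.split(X, y, groups), y, groups)`, assuming `y` and `groups` can be indexed by integer position.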

notebook example

