How can I do cross-validation and oversampling with an imbalanced dataset?

Stef on 14 Jul 2018
Answered: Kenta on 11 Jul 2020
I have an imbalanced dataset: very few observations belong to category 1 and many belong to category 0, so I want to oversample the smaller class 1. However, I have to be careful during cross-validation that copies of the same category 1 observation do not end up in both the training set and the validation set. Does anybody know how to code up the cross-validation?
X_train = [1 2 3 2 4 5];
y_train = [0 0 0 0 1 1];
X_test = [2 4 1];
y_test = [0 1 0];
What I would do now is oversample the observations in category 1 by duplicating them:
X_train = [1 2 3 2 4 5 4 5];
y_train = [0 0 0 0 1 1 1 1];
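The duplication above can be written generally instead of by hand. The following is a minimal sketch, assuming class 1 is the minority and that balancing the two classes is the goal. The names idx_min, idx_maj, n_extra, pick, X_over and y_over are introduced here for illustration only. Because the copies are drawn at random with replacement, the added values may be [4 4] or [5 5] rather than exactly [4 5].

```matlab
% Sketch: random oversampling of the minority class (assumed to be class 1).
X_train = [1 2 3 2 4 5];
y_train = [0 0 0 0 1 1];

idx_min = find(y_train == 1);   % positions of minority observations
idx_maj = find(y_train == 0);   % positions of majority observations

% Number of extra minority copies needed to match the majority count.
n_extra = max(0, numel(idx_maj) - numel(idx_min));

% Draw n_extra minority positions at random, with replacement.
pick = idx_min(randi(numel(idx_min), 1, n_extra));

% Append the drawn copies to the original training data.
X_over = [X_train, X_train(pick)];
y_over = [y_train, y_train(pick)];
```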
Could anybody please help me with the cross-validation when oversampling?
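One common way to avoid the leakage described above is to split into folds first and oversample only the training part of each fold, so that no copy of a held-out observation is seen during training. The following is a minimal sketch under stated assumptions: k = 2 folds, class 1 as the minority, a hand-rolled stratified fold assignment using base MATLAB only, and at least one minority observation in every training part. The names fold, train_idx, test_idx, X_tr, y_tr, X_te and y_te are illustrative.

```matlab
% Sketch: k-fold cross-validation with oversampling inside each fold.
X = [1 2 3 2 4 5];
y = [0 0 0 0 1 1];
k = 2;                                   % assumed number of folds

% Stratified fold assignment: shuffle each class, then deal its
% observations round-robin across the k folds.
fold = zeros(size(y));
for c = unique(y)
    idx = find(y == c);
    idx = idx(randperm(numel(idx)));
    fold(idx) = mod(0:numel(idx)-1, k) + 1;
end

for f = 1:k
    test_idx  = find(fold == f);         % held-out observations
    train_idx = find(fold ~= f);         % observations used for fitting

    X_tr = X(train_idx);  y_tr = y(train_idx);
    X_te = X(test_idx);   y_te = y(test_idx);

    % Oversample class 1 within the training part only.
    % Assumes the training part contains at least one class 1 observation.
    idx_min = find(y_tr == 1);
    n_extra = max(0, sum(y_tr == 0) - numel(idx_min));
    pick = idx_min(randi(numel(idx_min), 1, n_extra));
    X_tr = [X_tr, X_tr(pick)];
    y_tr = [y_tr, y_tr(pick)];

    % Fit a model on (X_tr, y_tr) and evaluate it on the untouched
    % (X_te, y_te) here; the held-out data is never oversampled.
    fprintf('fold %d: %d train (%d in class 1), %d test\n', ...
        f, numel(y_tr), sum(y_tr == 1), numel(y_te));
end
```

If the Statistics and Machine Learning Toolbox is available, cvpartition(y, 'KFold', k) together with training and test could replace the manual fold assignment; the oversampling step inside the loop would stay the same.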

Accepted Answer

Kenta on 11 Jul 2020

More Answers (0)
