How can I do cross-validation and oversampling with an imbalanced dataset?

Stef on 14 Jul 2018
Answered: Kenta on 11 Jul 2020
I have an imbalanced dataset: very few observations belong to category 1 and many belong to category 0, so I want to oversample the smaller class 1. When cross-validating, however, I have to make sure that copies of the same category 1 observation do not end up in both the training set and the validation set. Does anybody know how to code this cross-validation?
X_train = [1 2 3 2 4 5];
y_train = [0 0 0 0 1 1];
X_test = [2 4 1];
y_test = [0 1 0];
What I would do now is to oversample the observations with category 1:
X_train = [1 2 3 2 4 5 4 5];
y_train = [0 0 0 0 1 1 1 1];
Could anybody please help me with the crossvalidation when oversampling?
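One common way to avoid the leakage described above is to split into folds first and then oversample only the training part of each fold, so the held-out fold never contains a duplicated observation. The following is a minimal sketch of that idea, not the accepted answer's code. It uses only base MATLAB; the variable names (`foldId`, `trainIdxOs`, and so on), the choice `k = 2`, and random duplication as the oversampling method are all assumptions made for illustration. It also assumes every training fold contains at least one class 1 observation, so `k` must not exceed the minority class count.

```matlab
% Data from the question (one feature, six observations), as column vectors
X = [1 2 3 2 4 5]';
y = [0 0 0 0 1 1]';

k = 2;      % number of folds (assumption; must not exceed the minority count)
rng(0);     % fixed seed so the split is reproducible

% Stratified fold assignment: spread each class evenly over the k folds
foldId  = zeros(size(y));
classes = unique(y);
for c = 1:numel(classes)
    idx = find(y == classes(c));
    idx = idx(randperm(numel(idx)));
    foldId(idx) = mod(0:numel(idx)-1, k)' + 1;
end

for f = 1:k
    valIdx   = find(foldId == f);   % held-out fold, left untouched
    trainIdx = find(foldId ~= f);

    % Oversample class 1 by duplicating training indices only
    minIdx = trainIdx(y(trainIdx) == 1);
    majIdx = trainIdx(y(trainIdx) == 0);
    nExtra = numel(majIdx) - numel(minIdx);
    extra  = minIdx(randi(numel(minIdx), nExtra, 1));
    trainIdxOs = [trainIdx; extra];

    XtrainFold = X(trainIdxOs);   ytrainFold = y(trainIdxOs);
    XvalFold   = X(valIdx);       yvalFold   = y(valIdx);

    % No original observation may appear on both sides of the split
    assert(isempty(intersect(trainIdxOs, valIdx)));

    % Fit a classifier on (XtrainFold, ytrainFold) and evaluate it on
    % (XvalFold, yvalFold) here.
    fprintf('Fold %d: %d training rows (%d in class 1), %d validation rows\n', ...
        f, numel(ytrainFold), sum(ytrainFold == 1), numel(yvalFold));
end
```

The sketch tracks row indices rather than feature values because the data contains repeated values (for example, `2` appears twice in class 0), so comparing values could not tell a duplicated observation from a distinct one. The separate `X_test`/`y_test` set from the question stays outside this loop and is never oversampled.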

Accepted Answer

Kenta on 11 Jul 2020

More Answers (0)
