# How to fill in NaNs or <undefined> in data with the mode of each column

Dhruv Ghulati on 21 Dec 2015
Commented: jgg on 22 Dec 2015
I have converted a mixed table of categorical and double columns into one where every column is of type double, by mapping each category in the categorical columns to a double.
The table has 40k rows and 40 columns. I want to replace each NaN with the mode of its column.
I found a clear looping method in R via this link, but couldn't find a simple loop in MATLAB to do it. inpaint_nans seems to be more focused on interpolating the data.
knnimpute() also fails because I can have swathes of up to 1000 rows that are all NaNs (so I need 1200+ neighbours), as well as 40+ columns, so the algorithm has to loop through 40! times, which is very slow.
Any ideas?


### Answers (1)

jgg on 22 Dec 2015

Compute the mode of each column, then use logical indexing to replace the NaNs with it:
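A = [1 2 NaN 4 5; 1 2 3 NaN 5; 1 NaN NaN NaN 5];
m = mode(A,1);               % mode of each column (mode ignores NaNs)
m = repmat(m,size(A,1), 1);  % one copy of the modes per row, so m matches size(A)
A_f = A;
A_f(isnan(A)) = m(isnan(A)); % overwrite each NaN with its column's mode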
Looping is not necessary if you use vectorized operations.
Note: if your matrix is very large, the repmat step can be replaced with a for loop over the columns in order to use less memory, but 40k by 40 is not that large, so it should be fine.
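The column-wise loop mentioned in the note could look something like the sketch below. It reuses the small example matrix from above and is only one way to write it; the variable names `col` and `j` are illustrative, not from the original answer.

```matlab
% Sketch: fill NaNs column by column, without building a full repmat copy.
A = [1 2 NaN 4 5; 1 2 3 NaN 5; 1 NaN NaN NaN 5];
A_f = A;
for j = 1:size(A, 2)
    col = A(:, j);
    % mode ignores NaN values, so this is the mode of the observed entries
    col(isnan(col)) = mode(col);
    A_f(:, j) = col;
end
```

If a column consists entirely of NaNs, `mode` returns NaN for it, so that column is left unchanged and would need a separate fallback value.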
##### Comments
jgg on 22 Dec 2015
If you liked this answer, please accept it so other people can see it resolved your problem!

