Replace NaN with zeros for several variables in a dataset

Vasquez on 25 Jul 2014
Commented: Vasquez on 26 Jul 2014
I have a large dataset with many variables containing NaN. I want to change all the NaN values to 0 at once, but I have not been able to do so. For instance, for a dataset A of size 1000000 x 50, I tried the following:
>> x=find(isnan(A));
which does not work because 'isnan' is not defined for datasets.
The following works, but I have to do it variable by variable:
>> x=find(isnan(A.VarName));
>> A.VarName(x)=0;
I also tried to use a loop but have not succeeded.
Any suggestions on how to change the NaN values for the whole dataset at once, or for a set of numeric variables at once? Thanks, Vasquez

Accepted Answer

Azzi Abdelmalek on 26 Jul 2014
Edited: Azzi Abdelmalek on 26 Jul 2014
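If A is your dataset:
B = double(A);         % convert the dataset to one numeric matrix
B(isnan(B)) = 0;       % replace every NaN with 0
A = replacedata(A,B);  % replacedata returns a new dataset, so assign the result
Or:
B = dataset2cell(A);   % the first row of B holds the variable names
B(cellfun(@(x) isnumeric(x) && isscalar(x) && isnan(x), B)) = {0};  % skip strings and other non-numeric cells
A = replacedata(A,B(2:end,:));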
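When the dataset also holds non-numeric variables, `double(A)` cannot concatenate everything into a single matrix. The following is a minimal sketch of the first option restricted to the double variables. It assumes that `double` and `replacedata` both accept a cell array of variable names as their `vars` argument; the toy dataset and its variable names (`x`, `label`, `y`) are made up for illustration.

```matlab
% Toy dataset standing in for the real one (variable names are hypothetical).
A = dataset({[1; NaN; 3], 'x'}, {{'a'; 'b'; 'c'}, 'label'}, {[NaN; 5; 6], 'y'});

% Flag the variables stored as double; cell strings and nominals are skipped.
isDbl = datasetfun(@(v) isa(v, 'double'), A);
vars  = A.Properties.VarNames(isDbl);

% Pull only those variables into one matrix, zero the NaNs, write them back.
X = double(A, vars);
X(isnan(X)) = 0;
A = replacedata(A, X, vars);
```

Only the selected variables are concatenated, so the temporary matrix `X` is no larger than the numeric part of the dataset.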
  1 Comment
Vasquez on 26 Jul 2014
Thanks, Azzi. Your first option worked perfectly.


More Answers (2)

Star Strider on 25 Jul 2014
  2 Comments
Vasquez on 26 Jul 2014
Using datasetfun still does not work. I used it as follows: B=datasetfun(@isnan,A), where A is my dataset. The error message said that isnan is not defined for type cell. It probably does not work because some of my variables are cell strings. Thanks for the answer, Star.
Star Strider on 26 Jul 2014
My pleasure!
You can also use dataset2cell and then cellfun. Changing the NaN values to zero can produce problems in statistical analyses, since zero can be considered valid data while NaN cannot, although I am certain you have considered this.
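The statistical caveat can be seen on a three-element vector. This is only an illustrative sketch with made-up numbers: it compares the mean of the observed values with the mean obtained after the NaN has been replaced by zero.

```matlab
x = [2 NaN 4];                          % one missing measurement

% Average of the observed values only.
meanIgnoringNaN = mean(x(~isnan(x)));   % (2 + 4) / 2

% Replace the NaN with 0, so the missing value counts as a real zero.
z = x;
z(isnan(z)) = 0;
meanWithZeros = mean(z);                % (2 + 0 + 4) / 3
```

The two results differ (3 versus 2), which is why the replacement should be a deliberate modelling choice rather than a convenience.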



Ahmet Cecen on 25 Jul 2014
It is surprising that the isnan function is not working; I have been able to use it in similar situations without any problems. How about starting with a zeros matrix B (1000000x50) and copying into it the elements selected by isfinite(A), or by ischar, or by whichever is* function applies to your dataset? This is quite memory intensive and a lazy way to do it. If you elaborate on what exactly your dataset contains, we might be able to suggest better alternatives.
B = zeros(1000000, 50);            % assumes A is a 1000000x50 numeric matrix
B(isfinite(A)) = A(isfinite(A));   % NaN and Inf entries stay 0
  1 Comment
Vasquez on 26 Jul 2014
Thanks for the suggestion, Ahmet. As background, my dataset A was produced by reshaping my original dataset from long to wide, which is why several variables are full of NaNs. In the new wide dataset, some variables are 'cell strings' with no NaNs, other variables are 'nominal arrays', and the rest are 'double' with a large number of NaNs.
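For a dataset with this mix of types, a plain loop over the variable names is one way to touch only the floating-point variables. The sketch below is an assumption-laden illustration, not the method used in the thread: the toy dataset and the names `id`, `grp`, `m1`, and `m2` are invented to mirror the three kinds of variables described above.

```matlab
% Toy wide dataset with cell-string, nominal, and double variables
% (all names are hypothetical).
A = dataset({{'u'; 'v'; 'w'}, 'id'}, ...
            {nominal({'lo'; 'hi'; 'lo'}), 'grp'}, ...
            {[NaN; 2; NaN], 'm1'}, ...
            {[4; NaN; 6], 'm2'});

names = A.Properties.VarNames;
for k = 1:numel(names)
    v = A.(names{k});          % extract one variable
    if isfloat(v)              % true for double/single, false for cellstr and nominal
        v(isnan(v)) = 0;       % zero the missing entries
        A.(names{k}) = v;      % write the variable back
    end
end
```

The loop runs once per variable rather than once per row, so with 50 variables the overhead of looping is small even for a million observations.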
