How to replace multiple columns with NaN

Hi, I have a large dataset where each column is a subject. I have to remove some of them from the analysis, but I have to keep the original subject numbers, so I can't simply delete the columns. My first idea was to fill all the bad columns with NaN, but doing this manually is very time-consuming. Is there a fast way to do this, such as giving MATLAB an array of column numbers and letting it do the job based on those numbers?

8 Comments

Dyuman Joshi on 25 May 2022
Is there a criterion for deciding which columns are good and which are bad?
Andrea Carobbi on 25 May 2022
Edited: Andrea Carobbi on 25 May 2022
Unfortunately not, the selection was done visually on some plots.
Dyuman Joshi on 25 May 2022
Visually in what sense?
dpb on 25 May 2022
Well, you've got to have the columns defined somehow, whether that's done programmatically or by hand. It would help to know how the data are stored: are you using a table, plain arrays, or something else? Given that you mention keeping a subject number, I'm guessing it may be a table. Tables can be addressed either by column index or by variable name, whichever is more convenient.
But yes, given an index array of the columns to blank out, it's trivial to write
ixBad=[1, 23, 4]; % the list of columns, however obtained
X(:,ixBad)=missing; % set those columns to missing/bad indicator
Using missing also handles the case of a table with different data types: it matches the inserted indicator to the type of the data it replaces. It makes no practical difference for a plain double array, where it simply becomes NaN.
See the documentation section on accessing data in tables for how to use variable names instead of indices, if you're using a table and that's more convenient.
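A minimal sketch of both addressing modes mentioned above, assuming the data are numeric. The matrix X, the table T, and the variable names S1 to S6 are made up for illustration and do not come from the original data.

```matlab
% Hypothetical data: 5 stimulus levels (rows) x 6 subjects (columns)
X = magic(6);
X = X(1:5, :);

ixBad = [2, 5];          % columns to exclude, however obtained
X(:, ixBad) = NaN;       % scalar expansion fills every row of those columns

% Same idea with a table, addressing the columns by variable name.
% The names S1..S6 are assumed here purely for illustration.
T = array2table(magic(6), 'VariableNames', {'S1','S2','S3','S4','S5','S6'});
badNames = {'S2', 'S5'};
T{:, badNames} = NaN(height(T), numel(badNames));

% Column positions are unchanged, so subject 6 is still column 6
nBadCols = sum(all(isnan(X), 1));
```

Later summary steps can skip the blanked subjects with the 'omitnan' flag, for example mean(X, 2, 'omitnan') to average each stimulus level across the remaining subjects.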
Benjamin Thompson on 25 May 2022
Posting a small subset or example of your data will probably help the Community most in providing a good answer quickly.
Andrea Carobbi on 26 May 2022
dpb, thank you. Your code does exactly what I wanted; I didn't know MATLAB has this function.
To answer the question, the data I'm analysing come from measurements on human eyes. The values in each of the matrix's columns should rise as the stimulus given to the subject rises. I made some plots of the mean progression of the response and wrote down by hand the subject numbers (that is, the column numbers) of those who did not show any progression. Now I have to remove these subjects and repeat the analysis while keeping the same numbering (subject 40 must remain 40). The easiest way I could think of was replacing all the bad data with NaN.
dpb on 26 May 2022
"I didn't know MATLAB has this function."
Read through the "Getting Started" documentation and examples to get an idea of how MATLAB works. Vector operations are key to using it effectively, and array addressing is a key element of that.
"...made some plots of the mean progression of the response and wrote down by hand the subject numbers (that is, the column numbers)"
You could probably write code that does that screening pretty much automagically as well; simply testing whether the slopes are significantly above zero would not be too difficult an exercise.
I would again recommend using the MATLAB table class and keeping the subject name as the column ID. It comes along "for free" and isn't confused with the numeric data.
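A rough sketch of the automatic screening suggested above, assuming each column holds one subject's responses at increasing stimulus levels. The data, the stimulus vector stim, and the cut-off minSlope are hypothetical. A plain slope threshold stands in here for a proper significance test, which would need a regression with confidence intervals on the slope.

```matlab
% Hypothetical data: responses of 4 subjects at 5 increasing stimulus levels
stim = (1:5)';
X = [ 1.0  2.0  0.5  3.0
      2.1  2.0  0.4  3.9
      2.9  2.1  0.6  5.2
      4.2  1.9  0.5  6.1
      5.0  2.0  0.5  6.8 ];

minSlope = 0.1;                      % assumed cut-off, tune for the real data
nSubj = size(X, 2);
slopes = zeros(1, nSubj);
for k = 1:nSubj
    p = polyfit(stim, X(:, k), 1);   % straight-line fit, p(1) is the slope
    slopes(k) = p(1);
end

ixBad = find(slopes < minSlope);     % subjects showing no clear progression
X(:, ixBad) = NaN;                   % blank them out, keep column numbering
```

The resulting ixBad plays the same role as the hand-written list of column numbers, so the rest of the analysis does not need to change.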
Andrea Carobbi on 27 May 2022
The main thing I missed, even after reading the Getting Started guide, is that I can use an array to pass multiple indices. Thank you for reminding me.
I honestly didn't think about making the check automatic, because for now the dataset isn't extremely big, so doing it by hand is still viable, but I will if the dataset becomes too big.

Answers (0)

Release

R2021b

Asked: 25 May 2022

Commented: 27 May 2022
