Why are some csv files imported incorrectly into my cell array?

Hi,
I have a cell array called alldata which contains the contents of 24 csv files. However, after importing these files I can see that the last five (for example the csv file: 5422_task.csv) have been imported incorrectly: the first column includes two values (separated by a comma) in a single cell, with an apostrophe in front.
alldata{1, 24}
ans =
1216×3 cell array
{'media_open; media_play; medi…'} {'2022/09/23 15:06:18:984'} {' 2022/09/23 15:11:37:652"'}
{'Multimedia File,"task_com.Ut…'} {1×1 missing } {1×1 missing }
{'Lower Label,"Weak Presence"' } {1×1 missing } {1×1 missing }
{'Upper Label,"Strong Presence"' } {1×1 missing } {1×1 missing }
{'Minimum Value,-100' } {1×1 missing } {1×1 missing }
{'Maximum Value,100' } {1×1 missing } {1×1 missing }
{'Number of Steps,9' } {1×1 missing } {1×1 missing }
{'Second,"Rating"' } {1×1 missing } {1×1 missing }
{'%%%%%%,"%%%%%%"' } {1×1 missing } {1×1 missing }
{'10.5,96.09' } {1×1 missing } {1×1 missing }
{'10.75,96.09' } {1×1 missing } {1×1 missing }
{'11,96.09' } {1×1 missing } {1×1 missing }
{'11.25,96.16375' } {1×1 missing } {1×1 missing }
{'11.5,96.45875' } {1×1 missing } {1×1 missing }
On the other hand, all the other csv files have been imported correctly, so that the two values that were separated by a delimiter appear in the first two columns (for example the csv file: 1311_task.csv).
alldata{1, 1}
ans =
682×3 cell array
{'media_open; media_play; medi…'} {'2022/09/19 14:42:27:371' } {' 2022/09/19 14:54:07:167"'}
{'Multimedia File' } {'com.UtrechtUniversity.XRPS_Q…'} {1×1 missing }
{'Lower Label' } {'Negative Affect' } {1×1 missing }
{'Upper Label' } {'Positive Affect' } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second' } {'Rating' } {1×1 missing }
{'%%%%%%' } {'%%%%%%' } {1×1 missing }
{[ 1]} {[ 0.7800]} {1×1 missing }
{[ 2]} {[ 0.8975]} {1×1 missing }
{[ 3]} {[ 0.7800]} {1×1 missing }
{[ 4]} {[ 0.7800]} {1×1 missing }
{[ 5]} {[ 0.8385]} {1×1 missing }
{[ 6]} {[ 0.7800]} {1×1 missing }
{[ 7]} {[ 0.7800]} {1×1 missing }
Any idea why this might be the case?
Thank you!

Accepted Answer

Voss on 13 Dec 2022
"Any idea why this might be the case?"
It's because the different files have commas and semicolons in different places, e.g. line 10 of 1311_task.csv looks like this:
1;0.78;
but line 10 of 5422_task.csv looks like this:
10.5,96.09;;
So in one file you've got a semicolon after each number, and in the other file a comma in between the numbers and two semicolons at the end of the line.
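One way to confirm this kind of difference is to look at the raw, unparsed lines and count the delimiters. The following is a minimal sketch, not the method used in this thread: the two sample files, their names, and their contents are made up to mimic the two layouts, and it assumes a MATLAB release that has readlines (R2020b or later).

```matlab
% Write two tiny sample files that mimic the two layouts described above.
% File names and contents are illustrative, not the real data.
folder = tempname;
mkdir(folder);
fileA = fullfile(folder, 'sample_semicolon.csv');
fileB = fullfile(folder, 'sample_comma.csv');

fid = fopen(fileA, 'w');
fprintf(fid, '%s\n', 'Second;Rating;', '1;0.78;', '2;0.8975;');
fclose(fid);

fid = fopen(fileB, 'w');
fprintf(fid, '%s\n', 'Second,"Rating";;', '10.5,96.09;;', '10.75,96.09;;');
fclose(fid);

% Count commas and semicolons on every line of each file.
files = {fileA, fileB};
commaCounts = cell(1, numel(files));
semiCounts  = cell(1, numel(files));
for k = 1:numel(files)
    raw = readlines(files{k});        % one string per line, unparsed
    raw = raw(strlength(raw) > 0);    % drop empty lines (e.g. a trailing one)
    commaCounts{k} = count(raw, ",");
    semiCounts{k}  = count(raw, ";");
    [~, name, ext] = fileparts(files{k});
    fprintf('%s%s: commas per line %s, semicolons per line %s\n', ...
        name, ext, mat2str(commaCounts{k}'), mat2str(semiCounts{k}'));
end
```

If the comma counts differ between files, as they do for these two samples, the files do not share a single delimiter convention, and a reader that is told (or guesses) only one delimiter will split them differently.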
I don't know what function(s) you're using to import the files, but here's an attempt to handle both of those situations with one piece of code:
files = {'1311_task.csv' '5422_task.csv'};
C = cell(1,numel(files));
for ii = 1:numel(files)
C{ii} = readcell(files{ii},'Delimiter',{',' ';'},'ConsecutiveDelimitersRule','join');
end
C{:}
ans = 682×3 cell array
{'media_open; media_play; media_end'} {'2022/09/19 14:42:27:371' } {' 2022/09/19 14:54:07:167"'}
{'Multimedia File' } {'com.UtrechtUniversity.XRPS_Quest-20220919-135434.mkv'} {1×1 missing }
{'Lower Label' } {'Negative Affect' } {1×1 missing }
{'Upper Label' } {'Positive Affect' } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second' } {'Rating' } {1×1 missing }
{'%%%%%%' } {'%%%%%%' } {1×1 missing }
{[ 1]} {[ 0.7800]} {1×1 missing }
{[ 2]} {[ 0.8975]} {1×1 missing }
{[ 3]} {[ 0.7800]} {1×1 missing }
{[ 4]} {[ 0.7800]} {1×1 missing }
{[ 5]} {[ 0.8385]} {1×1 missing }
{[ 6]} {[ 0.7800]} {1×1 missing }
{[ 7]} {[ 0.7800]} {1×1 missing }
ans = 1216×3 cell array
{'media_open; media_play; media_end,"2022/09/23 15:06:11:215' } {'2022/09/23 15:06:18:984'} {' 2022/09/23 15:11:37:652"'}
{'Multimedia File,"task_com.UtrechtUniversity.XRPS_Quest-20220923-142855.mkv"'} {1×1 missing } {1×1 missing }
{'Lower Label,"Weak Presence"' } {1×1 missing } {1×1 missing }
{'Upper Label,"Strong Presence"' } {1×1 missing } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second,"Rating"' } {1×1 missing } {1×1 missing }
{'%%%%%%,"%%%%%%"' } {1×1 missing } {1×1 missing }
{[ 10.5000]} {[ 96.0900]} {1×1 missing }
{[ 10.7500]} {[ 96.0900]} {1×1 missing }
{[ 11]} {[ 96.0900]} {1×1 missing }
{[ 11.2500]} {[ 96.1637]} {1×1 missing }
{[ 11.5000]} {[ 96.4587]} {1×1 missing }
{[ 11.7500]} {[ 96.0900]} {1×1 missing }
{[ 12]} {[ 96.0900]} {1×1 missing }
As you can see there, the header info (lines 1-9) is not parsed the same between the two files, but the data section (lines 10-end) is, so maybe that's good enough?
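If only the data section matters, one option is to skip the inconsistently parsed header and keep the numeric rows. The sketch below is one possible approach under stated assumptions: every file contains a marker row whose first cell starts with '%%%%%%', and every row below it holds numbers in the first two columns. The small cell array C is a hand-written stand-in for one readcell result, using values from the output above.

```matlab
% Hand-written stand-in for one cell array returned by readcell.
C = {'Second,"Rating"', missing, missing; ...
     '%%%%%%,"%%%%%%"', missing, missing; ...
     10.5,              96.09,   missing; ...
     10.75,             96.09,   missing; ...
     11,                96.09,   missing};

% Find the marker row: the first text cell in column 1 starting with '%%%%%%'.
firstCol = C(:, 1);
isText = cellfun(@(x) ischar(x) || isstring(x), firstCol);
isMarker = false(size(firstCol));
isMarker(isText) = startsWith(string(firstCol(isText)), "%%%%%%");
markerRow = find(isMarker, 1);

% Everything below the marker is assumed to be numeric in columns 1 and 2.
data = cell2mat(C(markerRow+1:end, 1:2));
disp(data)
```

Here data is an ordinary numeric matrix with the time stamps in column 1 and the ratings in column 2. If a file contains non-numeric cells below the marker, cell2mat will error, so that assumption is worth checking on the real files.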

5 Comments

lil brain on 14 Dec 2022
Edited: lil brain on 14 Dec 2022
Hi @Voss thanks once again for helping out!
I actually use uigetfile to select multiple files in a folder like this:
[file_list, path_n] = uigetfile('.csv', 'Grab csv', 'Multiselect', 'on');
if ~iscell(file_list)
file_list = {file_list};
end
% rename file list
valid_high_IDs = file_list;
%remove the csv from the name
valid_high_IDs = strrep(strrep(valid_high_IDs,' - edited',''),'_task.csv','');
for i = 1:length(file_list)
filename = file_list{i};
data_in = readcell([path_n, filename]);
alldata{1,i} = data_in;
end
How would I need to change your code above if I want to import multiple files using uigetfile?
Thanks!
Stephen23 on 14 Dec 2022
Edited: Stephen23 on 14 Dec 2022
[F,P] = uigetfile('*.csv', 'Grab csv', 'Multiselect','on');
F = cellstr(F); % simpler than IF and {}
N = numel(F);
C = cell(1,N);
for ii = 1:N
G = fullfile(P,F{ii});
C{ii} = readcell(G, 'Delimiter',{',',';'}, 'ConsecutiveDelimitersRule','join');
end
lil brain on 14 Dec 2022
Hi @Stephen23 thanks!
However, I get the error:
Error using readcell
Unable to find or open '1311_task.csv'. Check the path and filename or file permissions.
Error in CRM_analysis (line 7)
C{ii} = readcell(F{ii}, 'Delimiter',{',',';'}, 'ConsecutiveDelimitersRule','join');
Why is that?
lil brain on 14 Dec 2022
It seems that this error appears no matter what files I select. It is always the first file in the list though.
Stephen23 on 14 Dec 2022
"Why is that?"
I forgot to include the path. See the fixed code in my comment above, which now passes fullfile(P,F{ii}) to readcell.
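The error comes from uigetfile returning the file name and its folder separately: a bare file name is only found if the file happens to be in the current folder (or on the MATLAB path). The following sketch illustrates this with stand-ins for the uigetfile outputs. The folder P, the file name in F, and the file contents are hypothetical and created only for the demonstration.

```matlab
% Stand-ins for the two outputs of uigetfile (hypothetical folder and file).
P = tempname;
mkdir(P);
F = {'demo_task.csv'};

% Create a small sample file inside that folder.
fid = fopen(fullfile(P, F{1}), 'w');
fprintf(fid, '%s\n', '1;0.78;', '2;0.8975;');
fclose(fid);

% The bare name is looked up relative to the current folder, so this is
% false unless a file with the same name happens to exist there.
foundByNameOnly = isfile(F{1});

% Joining folder and name gives a location that does not depend on the
% current folder.
G = fullfile(P, F{1});
foundWithPath = isfile(G);

% Reading via the full path works wherever the script is run from.
D = readcell(G, 'Delimiter', {',' ';'}, 'ConsecutiveDelimitersRule', 'join');
```

Using fullfile rather than concatenating [path_n, filename] also keeps the code independent of whether the folder string ends with a file separator.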

Asked: 13 Dec 2022
Edited: 14 Dec 2022