Find a specific value in a csv file

How can I do this?
In a CSV file, I want to find the year in Part 1 and the D value (e.g. "D":"1") in Part 2.
In the end, I want to produce:
2022, 1
2021, 2
2020, 3
2019, 4
2018, 5

Answers (2)

Voss on 15 Jul 2022

0 votes

Maybe this will work on your file (if not, upload the file here)
file_name = 'table_data.csv';
C = readcell(file_name,'Delimiter',',','NumHeaderLines',1);
years = regexp(C(:,1),'(\d{4})','tokens','once');
years = vertcat(years{:})
years = 5×1 cell array
    {'2022'}
    {'2021'}
    {'2020'}
    {'2019'}
    {'2018'}
d = regexp(C(:,end),'D:"(\d+)"','tokens','once');
d = vertcat(d{:})
d = 5×1 cell array
    {'1'}
    {'2'}
    {'3'}
    {'4'}
    {'5'}
% result as a cell array of character vectors:
result = [years d]
result = 5×2 cell array
    {'2022'}    {'1'}
    {'2021'}    {'2'}
    {'2020'}    {'3'}
    {'2019'}    {'4'}
    {'2018'}    {'5'}
% result as a numeric matrix:
result = str2double([years d])
result = 5×2
        2022           1
        2021           2
        2020           3
        2019           4
        2018           5

9 Comments

Alexai on 16 Jul 2022
I edited your file (table_data.csv).
From this file, I want to produce:
2022, 1
2021, 2
2020, 3
2019, 4
2018, 5
Voss on 16 Jul 2022
file_name = 'table_data.csv';
C = readcell(file_name,'Delimiter','\t','NumHeaderLines',1)
C = 5×2 cell array
    {'A/1/2/AB0000/2022abc'}    {'"B":"1", "C":"2", "D":"1"'}
    {'A/1/2/AB0000/2021abc'}    {'"B":"1", "C":"2", "D":"2"'}
    {'A/1/2/AB0000/2020abc'}    {'"B":"1", "C":"2", "D":"3"'}
    {'A/1/2/AB0000/2019abc'}    {'"B":"1", "C":"2", "D":"4"'}
    {'A/1/2/AB0000/2018abc'}    {'"B":"1", "C":"2", "D":"5"'}
years = regexp(C(:,1),'/(\d{4})','tokens','once');
years = vertcat(years{:})
years = 5×1 cell array
    {'2022'}
    {'2021'}
    {'2020'}
    {'2019'}
    {'2018'}
d = regexp(C(:,end),'"D":"(\d+)"','tokens','once');
d = vertcat(d{:})
d = 5×1 cell array
    {'1'}
    {'2'}
    {'3'}
    {'4'}
    {'5'}
% result as a cell array of character vectors:
result = [years d]
result = 5×2 cell array
    {'2022'}    {'1'}
    {'2021'}    {'2'}
    {'2020'}    {'3'}
    {'2019'}    {'4'}
    {'2018'}    {'5'}
% or, result as a numeric matrix:
result = str2double([years d])
result = 5×2
        2022           1
        2021           2
        2020           3
        2019           4
        2018           5
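For readers without the attached file, the same steps can be tried on an inline cell array. This is only a sketch: the rows of C are copied from the readcell output shown above (three of the five), and the final fprintf line is an added assumption about how the "2022, 1" layout from the question might be printed.

```matlab
% Sample rows copied from the readcell output above, inlined so that
% the sketch runs without table_data.csv.
C = {'A/1/2/AB0000/2022abc', '"B":"1", "C":"2", "D":"1"'
     'A/1/2/AB0000/2021abc', '"B":"1", "C":"2", "D":"2"'
     'A/1/2/AB0000/2020abc', '"B":"1", "C":"2", "D":"3"'};

% Four digits immediately following a slash -> the year
years = regexp(C(:,1),'/(\d{4})','tokens','once');
years = vertcat(years{:});

% Digits between the quotes after "D": -> the D value
d = regexp(C(:,2),'"D":"(\d+)"','tokens','once');
d = vertcat(d{:});

% Numeric matrix, one row per line of the file
result = str2double([years d]);

% Print as "year, D"; the transpose makes fprintf consume row by row
fprintf('%d, %d\n', result.');
```

The transpose matters because fprintf walks the matrix column by column; without it the years would be printed first and the D values afterwards.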
Alexai on 16 Jul 2022
Edited: Alexai on 16 Jul 2022
How can I read the next tab in this code?
years = regexp(C(:,1),'/(\d{4})','tokens','once');
Do I only need to add two slashes ('//')?
Voss on 16 Jul 2022
I'm not sure what you mean by "read next tab", because the file is read using tab as the delimiter:
file_name = 'table_data.csv';
C = readcell(file_name,'Delimiter','\t','NumHeaderLines',1)
so there are no tabs in C:
contains(C,sprintf('\t'))
However, if you want to get the part between the last slash and the end of each character vector in the first column of C (the end corresponds to where the tabs are in the file), you can do this:
% match a slash followed by any non-slash characters, occurring at the end
% (to prevent matching previous slash+stuff parts)
years = regexp(C(:,1),'/([^/]*)$','tokens','once');
years = vertcat(years{:})
And to make the regular expression for the "D" part more general (allow matching anything between the double-quotes, not just digits), you can do this:
% match the literal "D":" and return any characters after that, before the next '"'
d = regexp(C(:,2),'"D":"(.*?)"','tokens','once');
d = vertcat(d{:})
More about Regular Expressions: Operators and Characters in Regular Expressions
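To see how the patterns in this thread differ, here is a small side-by-side sketch on single character vectors. The string s is one entry from the first column; q is a hypothetical row with a non-numeric D value, made up only to show where '(\d+)' and '(.*?)' behave differently.

```matlab
s = 'A/1/2/AB0000/2022abc';        % one sample entry from the first column

% Four digits immediately following a slash
t1 = regexp(s,'/(\d{4})','tokens','once');      % {'2022'}

% Everything after the last slash, up to the end of the text
t2 = regexp(s,'/([^/]*)$','tokens','once');     % {'2022abc'}

q = '"B":"1", "C":"2", "D":"x9"';  % hypothetical row, D is not numeric

% Digits only: no match here, so an empty cell comes back
t3 = regexp(q,'"D":"(\d+)"','tokens','once');

% Lazy "anything": stops at the next double quote
t4 = regexp(q,'"D":"(.*?)"','tokens','once');   % {'x9'}
```

The lazy quantifier (.*?) is what keeps the match from running on to a later double quote in the same line.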
Voss on 17 Jul 2022
Another way to use regular expressions to achieve the same result would be to operate on the contents of the file directly, i.e., not on a cell array that comes from readcell.
Here's how that would work:
file_name = 'table_data.csv';
fid = fopen(file_name);
data = fread(fid,'*char').'; % read the file as a character vector
fclose(fid);
disp(data);
Part 1	Part 2
A/1/2/AB0000/2022abc	"""B"":""1"", ""C"":""2"", ""D"":""1"""
A/1/2/AB0000/2021abc	"""B"":""1"", ""C"":""2"", ""D"":""2"""
A/1/2/AB0000/2020abc	"""B"":""1"", ""C"":""2"", ""D"":""3"""
A/1/2/AB0000/2019abc	"""B"":""1"", ""C"":""2"", ""D"":""4"""
A/1/2/AB0000/2018abc	"""B"":""1"", ""C"":""2"", ""D"":""5"""
% match a slash, followed by any non-slash non-tab characters, followed by
% a tab, and return the stuff between the slash and the tab
% (note: not using 'once' this time because this regexp operates on the
% file's entire contents)
years = regexp(data,'/([^\t/]*)\t','tokens');
years = vertcat(years{:})
years = 5×1 cell array
    {'2022abc'}
    {'2021abc'}
    {'2020abc'}
    {'2019abc'}
    {'2018abc'}
% match the literal '""D"":""', return the characters after that, before the next '""'
d = regexp(data,'""D"":""(.*?)""','tokens');
d = vertcat(d{:})
d = 5×1 cell array
    {'1'}
    {'2'}
    {'3'}
    {'4'}
    {'5'}
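If only the four-digit year and a numeric D value are wanted, the whole-file approach can be narrowed to digits and converted to numbers in one go. The sketch below is an assumption-laden variant, not the code above: data is a two-row stand-in for the file contents built with sprintf, and it assumes every row contains exactly one year and one D value.

```matlab
% Stand-in for the file contents (header plus two rows). sprintf turns
% \t and \n into real tab and newline characters.
data = sprintf(['Part 1\tPart 2\n' ...
    'A/1/2/AB0000/2022abc\t"""B"":""1"", ""C"":""2"", ""D"":""1"""\n' ...
    'A/1/2/AB0000/2021abc\t"""B"":""1"", ""C"":""2"", ""D"":""2"""\n']);

% No 'once' option: every match in the text is returned
years = regexp(data,'/(\d{4})','tokens');
years = str2double(vertcat(years{:}));

% The quotes are doubled in the raw file, so they are doubled here too
d = regexp(data,'""D"":""(\d+)""','tokens');
d = str2double(vertcat(d{:}));

% One row per data line: [year, D]
result = [years d];
```

Because the two regexp calls run independently over the whole text, a row that is missing either field would silently misalign the columns; checking numel(years) == numel(d) before concatenating is a cheap safeguard.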
Alexai on 17 Jul 2022
'A/1/2/AB0000/2022abc/2023'
If /2023 has been added, how can I read 2023?
Voss on 17 Jul 2022
Try to adapt some of the examples I've given. Refer to the link I shared.
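As one possible adaptation (a sketch reusing the patterns from the earlier comments, applied to the example string from the question), anchoring the match at the end of the text with $ picks up the trailing 2023 rather than the first four-digit run:

```matlab
s = 'A/1/2/AB0000/2022abc/2023';   % example string from the question

% Unanchored: first run of four digits that follows a slash
firstYear = regexp(s,'/(\d{4})','tokens','once');     % {'2022'}

% Anchored with $: four digits after a slash at the very end
lastYear = regexp(s,'/(\d{4})$','tokens','once');     % {'2023'}

% Everything after the last slash, whatever it contains
lastPart = regexp(s,'/([^/]*)$','tokens','once');     % {'2023'}
```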
Alexai on 19 Jul 2022
Edited: Alexai on 19 Jul 2022
Sorry, I made a mistake. I want to read the part after the next slash, e.g. A/B/C0001/123456/abc0011
Thankfully, you gave me this code:
years = regexp(C(:,1),'/(\d{4})','tokens','once');
but I want to read the part after the next slash (abc0011 rather than 123456). How can I revise this code?
Voss on 19 Jul 2022
file_name = 'table_data.csv';
C = readcell(file_name,'Delimiter','\t','NumHeaderLines',1)
C = 5×2 cell array
    {'A/1/2/AB0000/2022abc'}    {'"B":"1", "C":"2", "D":"1"'}
    {'A/1/2/AB0000/2021abc'}    {'"B":"1", "C":"2", "D":"2"'}
    {'A/1/2/AB0000/2020abc'}    {'"B":"1", "C":"2", "D":"3"'}
    {'A/1/2/AB0000/2019abc'}    {'"B":"1", "C":"2", "D":"4"'}
    {'A/1/2/AB0000/2018abc'}    {'"B":"1", "C":"2", "D":"5"'}
years = regexp(C(:,1),'/([^/]*/[^/]*)$','tokens','once');
years = vertcat(years{:})
years = 5×1 cell array
    {'AB0000/2022abc'}
    {'AB0000/2021abc'}
    {'AB0000/2020abc'}
    {'AB0000/2019abc'}
    {'AB0000/2018abc'}
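The regular expression above returns the last two slash-separated parts together. If a single part is wanted by position instead, splitting on the slash may be easier to adjust than the pattern. A minimal sketch with strsplit, using the example string from the question (the variable names are illustrative):

```matlab
s = 'A/B/C0001/123456/abc0011';    % example string from the question

% Split on '/' into a 1x5 cell array of character vectors
parts = strsplit(s,'/');

lastPart = parts{end};             % 'abc0011'
prevPart = parts{end-1};           % '123456'

% Rejoin the last two parts, like the regular expression above
lastTwo = strjoin(parts(end-1:end),'/');   % '123456/abc0011'
```

Changing the index (end, end-1, and so on) selects a different part without touching any regular expression.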


Abderrahim. B on 15 Jul 2022

0 votes

Hi!
Try this:
% Create, split, and extract from part 1
part1 = "A/1/2/" + string(2022:-1:2018) ;
part1 = split(part1, '/') ;
yrs = part1(:,:,end) ;
% Create, split, and extract from part 2
part2 = " ""B"":""1"", ""C"":""2"", ""D"":""" + string(1:5) + '"' ;
part2 = split(part2, '"') ;
dig = str2double(part2(:,:,end-1)) ;
% Result
result = transpose(yrs + "," + dig )
result = 5×1 string array
    "2022,1"
    "2021,2"
    "2020,3"
    "2019,4"
    "2018,5"
Use datetime if you want to convert this string array to a datetime array.
Please keep in mind that this is just one of many ways to solve it; you can also use regular expressions or patterns to do this.
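For the pattern-based route mentioned here, a rough sketch could look like the following. It assumes R2020b or later (for digitsPattern and extract), builds sample strings shaped like the rows shown earlier in the thread rather than reading the file, and assumes each string contains exactly one slash followed by four digits.

```matlab
% Sample data shaped like the two columns of the file (5x1 string arrays)
part1 = "A/1/2/AB0000/" + string(2022:-1:2018).' + "abc";
part2 = """B"":""1"", ""C"":""2"", ""D"":""" + string(1:5).' + """";

% A slash followed by four digits, then drop the leading slash
tok = extract(part1, "/" + digitsPattern(4));
yrs = extractAfter(tok, 1);

% Text between the literal "D":" and the next double quote
dig = str2double(extractBetween(part2, """D"":""", """"));

% Join as "year, D"
result = yrs + ", " + dig;
```

Including the slash in the pattern keeps the four zeros in AB0000 from being picked up as a year.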

Asked: 15 Jul 2022

Commented: 19 Jul 2022
