Comparing a million data points from CSV files takes too much time

Rizky Alfi on 17 Sep 2022
Commented: Rizky Alfi on 17 Sep 2022
Hello everyone,
I am quite new to this program and need some help with this problem. I want to compare 1 million numbers to make sure that no two consecutive numbers are the same (n-1 ~= n). I wrote the code below and timed it with tic/toc; the elapsed time was 40944.541765 seconds, and that is for just one CSV file. I actually want the code to run for every CSV file in the folder, but that is quite complicated, so I am focusing on getting the calculation right for one CSV file first. How could I optimize this piece of code and make the calculation more accurate? Thank you.
data = csvread('data.csv',9); % Read the CSV, starting at row offset 9
a = zeros(1,999999); % Preallocate the result array
for i=1:999998
    t = data(i) ~= data(i+1); % 1 if element i differs from element i+1
    a(i) = t; % Store t in the array
    v=sum(a(:)==0); % Count the zeros in the array
    csvwrite('count.csv',v); % Write the count to a new CSV file
end


Bhaskar R on 17 Sep 2022
I assume you want to count how many values differ from the value that follows them.
This can be done without loops; the following may help:
data = randi(100, [1, 999999]); % random data with the same length as yours
v = sum(diff(data) ~= 0);
Elapsed time is 0.022935 seconds.
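To make it easier to see what diff returns and how the two possible counts relate, here is a small sketch on a made-up vector. The values and the names d, nChanged, and nEqual are only for illustration and do not come from the original data.

```matlab
% Toy vector, made up for illustration (not the asker's data)
data = [3 3 5 7 7 7 2 4];

d = diff(data);            % d(k) = data(k+1) - data(k)

nChanged = sum(d ~= 0);    % neighboring pairs whose values differ
nEqual   = sum(d == 0);    % neighboring pairs whose values are equal

% Each of the numel(data)-1 neighboring pairs falls in exactly one group
assert(nChanged + nEqual == numel(data) - 1);

fprintf('changed: %d, equal: %d\n', nChanged, nEqual);   % changed: 4, equal: 3
```

Which of the two counts is wanted depends on whether you are counting changes or repeats; both come from the same vectorized diff call, so neither needs a loop.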
1 Comment
Rizky Alfi on 17 Sep 2022
Thank you, sir. I first did the calculation in Microsoft Excel to check that the MATLAB output is correct, using =a2<>a1 in column B and =COUNTIF(B1:B1000000;"false"). Your answer is insightful. I tried it, and the only adjustment I needed was to change
v = sum(diff(data) ~= 0);
to
v = sum(diff(data8a) == 0);
so that it produces the same output as Microsoft Excel. I will accept your answer. Thank you.
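The question also mentions running the count for every CSV file in a folder. A minimal sketch of one way to do that follows. It assumes the files sit in the current folder, that each one uses the same row offset of 9 as data.csv, and that the values can be read in linear order as in the original loop. The names folder, files, and counts are hypothetical.

```matlab
folder = pwd;                                  % assumption: CSV files are in the current folder
files  = dir(fullfile(folder, '*.csv'));

% Skip the output file from the question so it is not read back in as input
files  = files(~strcmp({files.name}, 'count.csv'));

counts = zeros(numel(files), 1);               % one count per input file
for k = 1:numel(files)
    % Same call as in the question: start reading at row offset 9
    data = csvread(fullfile(folder, files(k).name), 9);

    values    = data(:);                       % linear order, like data(i) in the original loop
    counts(k) = sum(diff(values) == 0);        % equal neighboring pairs in this file

    fprintf('%s: %d\n', files(k).name, counts(k));
end
```

csvread is kept here only to match the question; newer MATLAB releases recommend readmatrix instead. Writing the results once after the loop, rather than inside it, avoids repeated file writes.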


More Answers (0)