How to make my algorithm work faster

Urs Grigore on 18 Aug 2017
Answered: Urs Grigore on 24 Aug 2017
Hello everyone! Sorry if this problem has a very simple solution, but I'm quite new to MATLAB and to programming. I have an algorithm in which I need to merge two very big tables. I need to do this merge three times, and each time is a little bit different. I wrote one script for each merge. Each script uses a while loop that compares the values in the first column of both tables and adds the lower one, plus some other values, to a new table. This takes a long time, and I would be really thankful if someone could help me make it faster. This is my first question on this forum, so any tips about how to write a question would be helpful as well.
6 Comments
Urs Grigore on 23 Aug 2017
Hello again! Sorry for taking so long to reply. I added some comments; I hope what I'm trying to do is clear now.
% I have 2 tables with different timestamps and different values
% for each timestamp.
% I need 1 table with all the timestamps and the values from both tables.
% When I add a timestamp from the first table to the new table,
% I add the values from the HIGHEST timestamp in the 2nd table that
% is LOWER than or EQUAL to the timestamp from the first table.
% The same applies when I add a timestamp from the 2nd table.
hTabel1=height(Tabel1) ; % hTabel1 is the height of the first table; I use it
                         % to know where the while loop should stop
hTabel2=height(Tabel2) ; % same
kTabel1=1; % kTabel1 is the row I am working with in the first table
kTabel2=1; % kTabel2 is the row I am working with in the second table
if Tabel1.Var1(1)>Tabel2.Var1(1)
    kTabel2 = find(Tabel2.Var1 <= Tabel1.Var1(1), 1);
    kTabel2=kTabel2+1;
    kTabel1=kTabel1+1;
end
% The first column of each table holds a timestamp in milliseconds.
% I bring the two row counters to consecutive values
% and increase them by 1 because I work with the last values I had.
if Tabel1.Var1(1)<Tabel2.Var1(1)
    kTabel1 = find(Tabel1.Var1 <= Tabel2.Var1(1), 1);
    kTabel1=kTabel1+1;
    kTabel2=kTabel2+1;
end
% same thing as above
TIMP=0;
LMAX_EURUSD_USDJPY_bid=0;
LMAX_EURUSD_USDJPY_ask=0;
Tabel3=table(TIMP,LMAX_EURUSD_USDJPY_bid,LMAX_EURUSD_USDJPY_ask);
kTabel3=1;
% again, the row I am at in Tabel3 is represented by kTabel3
while kTabel1<=hTabel1 || kTabel2<=hTabel2
    % while I still have values in the tables
    ok=false;
    if Tabel1.Var1(kTabel1) < Tabel2.Var1(kTabel2)
        kTabel1=kTabel1+1;
        time=Tabel1.Var1(kTabel1-1);
        ok=true;
    end
    % check whether the current row of the first table has a lower timestamp
    % than the current row of the 2nd table
    if ~ok && (Tabel1.Var1(kTabel1) == Tabel2.Var1(kTabel2))
        kTabel1=kTabel1+1;
        kTabel2=kTabel2+1;
        time=Tabel1.Var1(kTabel1-1);
        ok=true;
    end
    % check whether the current row of the first table has a timestamp equal
    % to the one in the 2nd table
    if ~ok && Tabel1.Var1(kTabel1) > Tabel2.Var1(kTabel2)
        kTabel2=kTabel2+1;
        time=Tabel2.Var1(kTabel2-1);
    end
    % check whether the current row of the 2nd table has a lower timestamp
    % than the current row of the 1st table
    % add the values to Tabel3 and increase the row counter
    cell={time,Tabel1.Var3(kTabel1-1)*Tabel2.Var3(kTabel2-1),Tabel1.Var5(kTabel1-1)*Tabel2.Var5(kTabel2-1)};
    Tabel3(kTabel3,:)=cell;
    kTabel3=kTabel3+1;
end
The tables with Var1 etc. are read from files, and those are the column names I get for them, which is why I use them this way. For the tables I create myself, I have started to use specific column names. Also, I want to say thanks to everyone who gave me tips here. If someone is willing to help me via Skype or any other voice chat, that would be amazing. I only worked in C++ and a little bit in Python and I'm having some issues getting used to MATLAB.
Stephen23 on 23 Aug 2017
"I only worked in C++ and a little bit in Python and I'm having some issues getting used to MATLAB"
Forget everything you know about C++ and Python: they work in totally different ways from MATLAB.
The introductory tutorials are the recommended way to learn important MATLAB concepts.


Answers (2)

Jan on 23 Aug 2017
Edited: Jan on 23 Aug 2017
Accessing a variable of a table costs time. Because you only read .Var1 from both tables, you can use temporary variables efficiently:
T1 = Tabel1.Var1;
T2 = Tabel2.Var1;
The term "Tabel" occurs so frequently in the code that it looks rather redundant. The naming of variables is a matter of taste, but anything that improves readability can help in understanding the code. Sometimes patterns in the code become clear once it is easier to read. I prefer "k1" to "kTabel1".
Replace:
kTabel1 = 1;
kTabel2 = 1;
if Tabel1.Var1(1)>Tabel2.Var1(1)
    while Tabel1.Var1(kTabel1)>Tabel2.Var1(kTabel2)
        kTabel2=kTabel2+1;
    end
    kTabel1=kTabel1+1;
end
by:
k1 = 1;
k2 = 1;
if T1(1) > T2(1)
    k2 = find(T2 >= T1(1), 1);   % first row of T2 that is not below T1(1), as in the while loop
    k1 = 2;
end
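As a quick sanity check of the loop-to-find replacement, here is a minimal sketch on made-up timestamps (the vectors T1 and T2 below are hypothetical, not the data from this thread). It compares the while loop with a find call that returns the first row of T2 that is not below T1(1). It also shows the find call for the row described in the question, the latest T2 timestamp that is lower than or equal to T1(1).

```matlab
% Hypothetical sample timestamps (sorted ascending), not taken from the thread's data.
T1 = [50; 60; 70];
T2 = [10; 20; 55; 65];

% Loop version: advance k2Loop until T2(k2Loop) is no longer below T1(1).
k2Loop = 1;
while T1(1) > T2(k2Loop)
    k2Loop = k2Loop + 1;
end

% find version: first row of T2 whose timestamp is not below T1(1).
k2Find = find(T2 >= T1(1), 1);

% Row with the latest T2 timestamp that is <= T1(1), as described in the question.
k2Last = find(T2 <= T1(1), 1, 'last');

fprintf('loop: %d, find: %d, last <=: %d\n', k2Loop, k2Find, k2Last);
```

Both find calls return an empty result if no row matches, so real code would probably need an isempty check before using the index.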
In contrast to the first version of your code, Tabel3 is not pre-allocated before the loop in the latest version. This slows down the processing substantially: growing an array iteratively can force MATLAB to reallocate and copy it again and again, so the total cost can grow quadratically with the number of rows. This was better, except for the name:
Tabel3 = zeros(hTabel1+hTabel2+5, 3);
I have little experience with tables. I guess that filling a double matrix is faster. Then you can create the table in one step after the loop.
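A minimal sketch of that idea, assuming placeholder values and shortened column names (n, M, 'bid' and 'ask' are hypothetical and chosen only for illustration): the loop writes into a pre-allocated double matrix, and array2table builds the table once at the end.

```matlab
% Sketch: fill a pre-allocated double matrix, then build the table once.
n = 5;                 % hypothetical row count; in the thread an upper bound is hTabel1 + hTabel2
M = zeros(n, 3);       % pre-allocate: one row per output line, three columns
for k = 1:n
    M(k, :) = [k * 1000, k * 1.1, k * 1.2];   % placeholder values, not real quotes
end
% Convert to a table in one step after the loop.
Tabel3 = array2table(M, 'VariableNames', {'TIMP', 'bid', 'ask'});
```

If the final number of rows is smaller than the pre-allocated size, the unused rows can be cropped with M = M(1:k3, :) before the conversion.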
"cell" is an important builtin function. Shadowing it by a local variable is not an error, but confusing.
cell = {time, Tabel1.Var3(kTabel1-1)*Tabel2.Var3(kTabel2-1), ...
        Tabel1.Var5(kTabel1-1)*Tabel2.Var5(kTabel2-1)};
Tabel3(kTabel3,:)=cell;
Or I assume this is faster:
Tabel3(k3, 1) = time;
Tabel3(k3, 2) = T1V3(k1 - 1) * T2V3(k2 - 1);
Tabel3(k3, 3) = T1V5(k1 - 1) * T2V5(k2 - 1);
Here "T1V3" is a shortcut for "Tabel1.Var3".
Instead of:
ok = false;
if xyz
    ok = true;
end
if ~ok && abc
    ok = true;
end
if ~ok ...
you can write:
if xyz
    ...
elseif abc
    ...
elseif ...
    ...
end
This will not reduce the runtime a lot, but it is nicer to read.
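Putting these suggestions together, the merge loop could look roughly like the sketch below. It is only an illustration under stated assumptions: the inputs are small made-up vectors (T1, V1, T2, V2 are hypothetical stand-ins for the timestamp and value columns), timestamps are sorted, and a row is only written once both inputs have delivered at least one value. The counters here hold the last row consumed, which differs slightly from the "next row" convention in the question.

```matlab
% Hypothetical inputs: sorted timestamps and one value column per table.
T1 = [10; 30; 40];  V1 = [1; 2; 3];
T2 = [20; 30; 50];  V2 = [10; 20; 30];

n1 = numel(T1);
n2 = numel(T2);
out = zeros(n1 + n2, 2);   % pre-allocated upper bound: [time, product]
k1 = 0;  k2 = 0;  k3 = 0;  % last row consumed from each input / last row written

while k1 < n1 || k2 < n2
    % Advance whichever input has the smaller next timestamp (both on a tie).
    if k2 >= n2 || (k1 < n1 && T1(k1 + 1) < T2(k2 + 1))
        k1 = k1 + 1;
        t = T1(k1);
    elseif k1 >= n1 || T2(k2 + 1) < T1(k1 + 1)
        k2 = k2 + 1;
        t = T2(k2);
    else
        k1 = k1 + 1;
        k2 = k2 + 1;
        t = T1(k1);
    end
    % Write a row only once both inputs have delivered at least one value.
    if k1 > 0 && k2 > 0
        k3 = k3 + 1;
        out(k3, :) = [t, V1(k1) * V2(k2)];
    end
end
out = out(1:k3, :);        % crop the unused pre-allocated rows
```

The extra bounds checks in the if and elseif conditions keep the loop from indexing past the end of one input while the other still has rows left.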
2 Comments
Urs Grigore on 23 Aug 2017
Edited: Urs Grigore on 23 Aug 2017
Thanks a lot for the tips! The thing is that I also need the rest of the values from the table in the last while loop, so I guess I cannot do that. I already changed the first two while loops and I am using the find function now. The last while loop is the one that takes the most time. Thanks a lot!
Jan on 23 Aug 2017
"I guess I cannot do that."
Cannot do what? Pre-allocation is essential.
Note that it is much easier to improve your code when we can run it, so please provide some representative inputs.



Urs Grigore on 24 Aug 2017

This is part of my first script:

%LMAX_CFH
lim=86400000000;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
T11=LMAX_EURUSD.Var1;
T12=LMAX_EURUSD.Var3;
T13=LMAX_EURUSD.Var5;
T21=LMAX_USDJPY.Var1;
T22=LMAX_USDJPY.Var3;
T23=LMAX_USDJPY.Var5;
table_merge1();
LMAX_EURUSD_USDJPY=T3;

This is my table_merge1 :

        % I have 2 tables with different timestamps and different values
        % for each timestamp.
        % I need 1 table with all the timestamps and the values from both tables.
        % When I add a timestamp from the first table to the new table,
        % I add the values from the HIGHEST timestamp in the 2nd table that
        % is LOWER than or EQUAL to the timestamp from the first table.
        % The same applies when I add a timestamp from the 2nd table.
        h1=height(LMAX_EURUSD) ; %h1 is the height of the first table, i use it 
                                 %to know until where should i go with the while
        h2=height(LMAX_USDJPY) ; %same
        k1=1; %k1 represents the line i am working with from the first table
        k2=1; %k2 represents the line i am working with from the second table
        if T11(1)>T21(1)
            k2 = find(T11(1) > T21, 1);
            k2=k2+1;
            k1=2;
            k2=k2+1;
        end
        % The first column of each table holds a timestamp in milliseconds.
        % I bring the two row counters to consecutive values
        % and increase them by 1 because I work with the last values I had.
        if T11(1)<T21(1)
            k1 = find(T11 <= T21(1), 1);
            k1=k1+1;
            k2=2;
        end
        %same thing as above
        TIMP=0;
        LMAX_EURUSD_USDJPY_bid=0;
        LMAX_EURUSD_USDJPY_ask=0;
        T3=table(TIMP,LMAX_EURUSD_USDJPY_bid,LMAX_EURUSD_USDJPY_ask);
        k3=1;
        %again the line i am in T3 is represented by k3
        while k1<=h1 || k2<=h2
            %while i still have values in both tables
            if  T11(k1) < T21(k2)
                k1=k1+1;
                time=T11(k1-1);
            elseif (T11(k1) == T21(k2))
                k1=k1+1;
                k2=k2+1;
                time=T11(k1-1);
            elseif T11(k1) > T21(k2)
                k2=k2+1;
                time=T21(k2-1);
            end
            %check if the line i am at in the 2nd table has a lower Timestamp 
            %than the line in the 1st table
            %add the values in the T3 and increase the line counter.
            %cell={time,T12(k1-1)*T22(k2-1),T13(k1-1)*T23(k2-1)};
            T3.TIMP(k3)=time;
            T3.LMAX_EURUSD_USDJPY_bid(k3)=T12(k1-1)*T22(k2-1);
            T3.LMAX_EURUSD_USDJPY_ask(k3)=T13(k1-1)*T23(k2-1);
            k3=k3+1;
        end

The files are too big to attach here, so I uploaded them to Dropbox: LMAX_EURUSD and LMAX_USDJPY.
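For comparison, the merge described in the code comments can also be written without a loop. The following is only a sketch under stated assumptions: the vectors T1, V1, T2, V2 are small hypothetical stand-ins for the timestamp and bid columns, timestamps are sorted and unique within each table, and rows before both tables have a value are dropped. It uses union to collect all timestamps and cumsum over ismember to find, for every timestamp, the latest row of each table at or before it.

```matlab
% Hypothetical inputs: sorted, unique timestamps and one value column per table.
T1 = [10; 30; 40];  V1 = [1; 2; 3];
T2 = [20; 30; 50];  V2 = [10; 20; 30];

t = union(T1, T2);               % all timestamps, sorted, duplicates merged

% For each t, count how many timestamps of each table are <= t.
% With sorted unique timestamps this count is the row of the latest value.
i1 = cumsum(ismember(t, T1));
i2 = cumsum(ismember(t, T2));

valid = i1 > 0 & i2 > 0;         % keep timestamps where both tables have a value
t = t(valid);
bid = V1(i1(valid)) .* V2(i2(valid));

T3 = table(t, bid, 'VariableNames', {'TIMP', 'bid'});
```

On large inputs this replaces the per-row table indexing with a few vector operations, which is usually where MATLAB code gains the most speed. Whether it matches the intended start-up behavior of the original scripts would need to be checked against real data.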
