What is the alternative function for evalin?

Question

0 votes

I want to improve the performance of my function.

Avoid functions such as eval, evalc, evalin, and feval(fname). Use the function handle input to feval whenever possible. Indirectly evaluating a MATLAB expression from text is computationally expensive.
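As a minimal sketch of the function-handle advice quoted above (the use of sin and the sample vector x are illustrative assumptions, not part of the original question), the same call can be made by name or through a handle:

```matlab
% Hypothetical example: calling sin indirectly on some sample data.
x = linspace(0, pi, 5);

% Text-based call: the function is identified by its name at call time.
y_name = feval('sin', x);

% Handle-based call: the handle is created once and can be reused.
f = @sin;
y_handle = f(x);            % equivalent to feval(f, x)

% Both calls return the same values.
same = isequal(y_name, y_handle);
```

The handle form also lets the function be passed around as an ordinary value, which is what functions such as fminsearch and timeit expect.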

I'm using the following to read data from the workspace into a function file.

A= evalin('base','dataset');

Is there any alternative way for reading variables from workspace to Matlab function that is efficient.

2 Comments

Stephen23 on 16 Aug 2021

"Is there any alternative way for reading variables from workspace to Matlab function that is efficient."

By far the most efficient approach (as well as being simpler, much more robust, and easier to debug) is to pass all data as input/output variables:

https://www.mathworks.com/help/matlab/matlab_prog/share-data-between-workspaces.html

Better code design does not just mean swapping some function X for another mysteriously faster function Y with exactly the same behavior; it means actually designing your data and code following best practices (following this forum is a good place to start learning those).

Walter Roberson on 16 Aug 2021

And see my newest test just posted here that shows passing parameters through multiple levels is faster.


Answer 1

Walter Roberson on 14 Aug 2021

2 votes

Shared variables are 2 to 5 times faster than evalin('base')

test_base()
ans = 5.7348e-06
ans = 2.7770e-06
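function test_base
    A = rand(1e7,1);
    assignin('base', 'A', A);      % copy A into the base workspace for evalin

    timeit(@() via_evalin(), 0)    % first result above: fetch via evalin
    timeit(@() via_shared(), 0)    % second result above: fetch via shared variable

    function B = via_evalin()
        % read the variable back from the base workspace
        B = evalin('base', 'A');
    end

    function B = via_shared()
        % A is shared with the enclosing function test_base
        B = A;
    end

end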

7 Comments

Walter Roberson on 14 Aug 2021

In different runs, the first bar chart is fairly variable; it is difficult to be certain what the relative timings are, but evalin() did tend to be slightly the slowest.

The first bar chart is timing making data available into a function and using the data inside the function (to avoid the possibility of the execution engine optimizing away the call if you did not use the variable.)

The second bar chart is timing just making data available into a function, without using the data inside the function.

Effectively, the time to use the data inside the function pretty much overwhelms the time to pass the data into the function, except possibly for very short functions or small data.

However, evalin() is consistently the slowest when no work is being done on the data.

Whether using a passed parameter or using a shared variable is faster cannot reliably be determined from this particular test.

test_base()

t = 6×1
   14.1588
   13.3560
   13.4462
    0.0073
    0.0017
    0.0019

function test_base
    N = 500;
    A = rand(1e7,1);
    assignin('base', 'A', A);
    t = zeros(6,1);

    % time N calls of each variant, with and without work on the data
    tic; for K = 1:N; via_evalin(); end; t(1) = toc;
    tic; for K = 1:N; via_shared(); end; t(2) = toc;
    tic; for K = 1:N; passed_in(A); end; t(3) = toc;
    tic; for K = 1:N; via_evalin_no_work(); end; t(4) = toc;
    tic; for K = 1:N; via_shared_no_work(); end; t(5) = toc;
    tic; for K = 1:N; passed_in_no_work(A); end; t(6) = toc;
    t

    cats = categorical({'evalin', 'shared variable', 'passed parameter', 'evalin no work', 'shared no work', 'passed no work'});
    figure(); bar(removecats(cats(1:3)), t(1:3))
    figure(); bar(removecats(cats(4:6)), t(4:6))

    function B = via_evalin()
        B = evalin('base', 'A').^2;
    end

    function B = via_shared()
        B = A.^2;
    end

    function B = passed_in(A)
        B = A.^2;
    end

    function B = via_evalin_no_work()
        B = evalin('base', 'A');
    end

    function B = via_shared_no_work()
        B = A;
    end

    function B = passed_in_no_work(A)
        B = A;
    end
end

Walter Roberson on 15 Aug 2021

Does your MATLAB version spontaneously create A1 and A2 in the base workspace? Or do you have code that creates A1 and A2 in the base workspace?

If your A1 and A2 spontaneously come into existence in your base workspace, then you should continue to use evalin('base') as any changes to your code could cause the miracle to stop working.

Using alternatives to evalin('base') is strictly for the case where you have code that creates the variables in the base workspace, and that version would look like

function this_is_the_outer_function

    % At this point, use your code that reads in or creates A1 and A2,
    % and store them as local variables instead of in the base workspace.

    x0 = 3;
    best_x = fminsearch(@prob, x0);
    disp(best_x)

    function F = prob(x)
        % A1 and A2 are shared with the outer function
        B1 = A1;
        B2 = A2;
        F = det(B1*x + B2);
    end
end

You have to rewrite so that you do not store the values in the base workspace in the first place.
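A runnable sketch of that rewrite might look as follows. The matrices A1 and A2 and the function name outer_with_shared_data are hypothetical stand-ins, chosen only so that the example executes; the real code would read or compute them where indicated.

```matlab
function best_x = outer_with_shared_data
    % Hypothetical data: stand-ins for whatever code reads or creates A1 and A2.
    A1 = [2 0; 0 1];
    A2 = [-1 0; 0 -1];

    x0 = 3;
    best_x = fminsearch(@prob, x0);

    function F = prob(x)
        % A1 and A2 are shared with the outer function; no evalin is needed.
        F = det(A1*x + A2);
    end
end
```

For these stand-in matrices the determinant is (2x-1)(x-1), which has its minimum at x = 0.75, so the search should end near that value.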

If for some reason you feel that you cannot rewrite to avoid using the base workspace, then you should keep doing what you are doing now. There is no alternative code for fetching variables from the base workspace, there is just evalin('base'). The point is that you should not be putting data into the base workspace if you want the highest performance.

Ammy on 16 Aug 2021

Thank you.

Walter Roberson on 16 Aug 2021

It is a fair question to ask what the best performance is if you have multiple levels of calls.

The following code goes through 5 calling levels before doing the "work" (fetching the value and assigning to output). Exact timings depend upon the run, but it seems consistent that creating 5 levels of shared variables is the slowest of these three, and that passing as a parameter is the fastest.

In the earlier tests, sharing through one level of nested function was nearly indistinguishable from passing as parameters; this test tells us that there is a performance penalty to having to find the shared variable through multiple levels.

@James Tursa @Jan

format long g

test_base();

t = 3×1
   0.005531
    0.00831
   0.002545

function test_base
    N = 500;
    A = rand(1e7,1);
    assignin('base', 'A', A);
    t = zeros(3,1);

    % time N calls of each variant through five calling levels
    tic; for K = 1:N; via_evalin1(); end; t(1) = toc;
    tic; for K = 1:N; via_shared1(); end; t(2) = toc;
    tic; for K = 1:N; via_passed1(A); end; t(3) = toc;
    t

    cats = categorical({'evalin', 'shared variable', 'passed parameter'});
    bar(cats, t);

    % evalin: five plain calls, then fetch from the base workspace
    function B = via_evalin1()
        B = via_evalin2();
    end
    function B = via_evalin2()
        B = via_evalin3();
    end
    function B = via_evalin3()
        B = via_evalin4();
    end
    function B = via_evalin4()
        B = via_evalin5();
    end
    function B = via_evalin5()
        B = evalin('base', 'A');
    end

    % shared variable: five levels of nesting, A is found in test_base
    function B = via_shared1()
        B = via_shared2();
        function B = via_shared2()
            B = via_shared3();
            function B = via_shared3()
                B = via_shared4();
                function B = via_shared4()
                    B = via_shared5();
                    function B = via_shared5()
                        B = A;
                    end
                end
            end
        end
    end

    % passed parameter: A is handed down through five calls
    function B = via_passed1(A)
        B = via_passed2(A);
    end
    function B = via_passed2(A)
        B = via_passed3(A);
    end
    function B = via_passed3(A)
        B = via_passed4(A);
    end
    function B = via_passed4(A)
        B = via_passed5(A);
    end
    function B = via_passed5(A)
        B = A;
    end
end


Answer 2

Jan on 17 Aug 2021

3 votes

You asked clearly about the runtime of the code. Distributing a variable by evalin() is a drawback, as others have already explained. It is faster to design a program such that all variables are provided through input and output arguments. This is why almost all functions in MATLAB's toolboxes use this method. (Exceptions: disp and save, which are driven by the names of the variables instead of their values.)

Stephen Cobeldick hit an important point in his comment: evalin() makes code hard to debug and maintain. If your code is useful, you will use it in other applications, and over time such code grows. In my case, some tiny scripts with 20 lines of code, which fixed some inputs taken from Excel files, have grown to 227,000 lines of MATLAB code since 1999. Now imagine that some of the functions use variables stored in the base workspace called "A1", and that the GUI of this program is open while some users run scripts in the command window that happen to use "A1" as a variable name. This will cause unexpected outputs, and it is nearly impossible to debug.

Storing variables in the base workspace and accessing them by evalin is a source of serious trouble. The debugging may take an hour, or in my case some weeks, because I would have to visit the lab that uses my software, and they would not be able to reproduce the problem until the other user runs their script again. Even if the runtime with evalin() were some microseconds faster than with other methods, an unstable code design can dramatically increase the time needed to get the results.

Remember: We write code to solve problems. The time until the problem is solved is the sum:

t_solved = t_designing + t_programming + t_documenting + ...
           t_debugging + t_optimizing + t_runtime

Trying to improve the processing speed while the code is still fragile due to evalin calls is called "premature optimization", and it is a common reason for unstable and consequently unusable code.

Your question shows that it is time to enter the next level of programming by learning how to design code efficiently. Software engineering is important, yet more or less unknown to too many scientists. I've seen a lot of programs developed for, e.g., a PhD, which are an undocumented pile of scripts that must be called in a specific order to obtain some results. As soon as the PhD is finished and the author has lost the post-it on which the required order was written, this software is useless and might as well be deleted.

My answer:

Do not try to improve the runtime; instead, spend your time refactoring the complete code such that all variables are provided either by inputs and outputs or by sharing the variables using nested functions. Take into account that good code will be reused in other projects, and this is only possible if the functions have a well-designed and documented interface. evalin() is a reliable indicator of messy code, so find a way to avoid it.
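One way to provide the variables purely through inputs, sketched here under assumptions, is to let an anonymous function capture the data and forward it to the objective function. The matrices A1 and A2 and the names outer_with_parameters and prob are hypothetical and only serve to make the example run.

```matlab
function best_x = outer_with_parameters
    % Hypothetical data: stand-ins for values the caller loads or computes.
    A1 = [2 0; 0 1];
    A2 = [-1 0; 0 -1];

    x0 = 3;
    % The anonymous function captures A1 and A2 and forwards them as inputs.
    best_x = fminsearch(@(x) prob(x, A1, A2), x0);
end

function F = prob(x, A1, A2)
    % All data arrives through the argument list; nothing is read from base.
    F = det(A1*x + A2);
end
```

Because prob depends only on its arguments, it can be tested on its own and reused in other projects without any assumptions about what the base workspace contains.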


Answer 3

Matt J on 14 Aug 2021

0 votes

You could pass 'dataset' as an input argument to your function.
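A minimal sketch of this suggestion, assuming the caller already has dataset as a variable: the function name summarize_dataset, the random stand-in data, and the body of the function are all hypothetical. Local functions at the end of a script file require R2016b or later; otherwise the function would go into its own file.

```matlab
% Caller (script): the variable is passed explicitly.
dataset = rand(1000, 3);               % hypothetical stand-in for the real data
result = summarize_dataset(dataset);

function out = summarize_dataset(data)
    % Hypothetical function body: the data arrives as an input argument,
    % so no evalin('base', 'dataset') call is needed inside the function.
    out = mean(data, 1);
end
```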

4 Comments

Walter Roberson on 14 Aug 2021

In the test I just posted, I was not able to reliably determine whether shared variables or passed parameters were faster. MathWorks documents passed parameters as being faster.

When I used timeit() instead of tic/toc then the variation between runs was much greater than the difference between the timings.
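One hedged way to put a number on that run-to-run variation is to repeat the timeit measurement and compare the spread with the median. The workload f, the data size, and the repeat count nRuns below are arbitrary assumptions for illustration.

```matlab
A = rand(1e6, 1);                  % hypothetical test data
f = @() sum(A.^2);                 % hypothetical workload to time

nRuns = 10;                        % arbitrary number of repeated measurements
t = zeros(nRuns, 1);
for k = 1:nRuns
    t(k) = timeit(f);              % timeit itself calls f several times
end

fprintf('median %.3g s, spread %.3g s\n', median(t), max(t) - min(t));
```

If the spread is comparable to the difference between two variants, that difference cannot be trusted from a single measurement.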

Jan on 17 Aug 2021

@Walter Roberson: "When I used timeit() instead of tic/toc then the variation between runs was much greater than the difference between the timings."

This is interesting. There is a reason for the variation and for the different timings. I assume that you did not run this test under a high CPU load from other applications. In my experience with the undocumented JIT acceleration, both methods address the variables directly and do not need to search the lookup table (which is needed, e.g., for eval()'ed variables). So I'd expect more constant timings.

I'm going to run your test cases on a machine without CPU speed-stepping. Maybe MATLAB is so efficient that the CPU falls asleep during the processing.
