Speed optimization of partial inner product (norm)

For a row vector, the norm can be written as
sqrt(sum(P.^2))
or
sqrt(P*P')
The latter is about twice as fast. Now I have a 4D array with dimensions [100,100,100,70] and would like to take the norm along the last dimension, yielding an array of dimensions [100,100,100]. This works:
sqrt(sum(P.^2,4))
but it is too slow. Does anyone know a way to speed this up, perhaps in a similar way to the 1D case?
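One way the 1D inner-product idea might carry over to the 4D case is sketched below: flatten the first three dimensions so that each row holds one 70-element vector, then let a matrix-vector product sum the squares. This is only an illustration, not the accepted answer's method; the variable names and the reduced array size are assumptions, and whether it beats `sum(P.^2,4)` depends on the MATLAB release and the hardware, so it should be timed on the actual data.

```matlab
% Sketch: norm along the 4th dimension via a matrix-vector product.
sz = [20 20 20 70];        % demo size, smaller than the [100,100,100,70] case
P  = rand(sz);

% Reference: the expression from the question.
Nref = sqrt(sum(P.^2, 4));

% Flatten dimensions 1-3 so that each row is one vector along dimension 4.
P2d = reshape(P, [], sz(4));              % size [8000, 70]

% Sum the squares of each row with a product against a column of ones.
N = sqrt((P2d.^2) * ones(sz(4), 1));      % size [8000, 1]
N = reshape(N, sz(1:3));                  % back to [20, 20, 20]

% The two results should agree to rounding error.
maxDiff = max(abs(N(:) - Nref(:)));
```

In releases that provide it (R2017b or later, to my knowledge), `vecnorm(P, 2, 4)` computes the same quantity directly and is worth including in any timing comparison.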

Accepted Answer

Matt J on 28 Feb 2014

0 votes

5 Comments

lvn on 28 Feb 2014
Thanks a lot! I just tested this function and it works, but unfortunately my data is in single format and converting it to double first makes it again very slow.
Matt J on 28 Feb 2014
Edited: Matt J on 28 Feb 2014
I'm sure it wouldn't be hard to modify the code to accept singles.
Jan on 28 Feb 2014
So you could ask the author, lvn. Do you want the output to be in single format as well?
Matt J on 28 Feb 2014
Edited: Matt J on 28 Feb 2014
Jan, if you're going to take that modification on, I would just request that the summations/accumulations in the norm calculation still be done in double precision, regardless of the class of the input/output (or that there be an option to do so).
I also vote that the output class should match the input class.
lvn on 17 Apr 2014
Jan, I am sorry I didn't see your comment until now (after I marked the question as answered, I didn't open it again).
This norm is still a bottleneck in my program, so I would be very grateful if you could make a version with both input and output in single format.
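For the single-precision case discussed in these comments, a plain-MATLAB sketch of "accumulate in double, return the input class" could look like the following. It is not the function from the accepted answer; it only uses the `outtype` option of the built-in `sum`, and the variable names and demo size are assumptions. Note that the squares themselves are still formed in single here; only the summation runs in double.

```matlab
% Single-precision input (demo size, smaller than [100,100,100,70]).
P = rand([20 20 20 70], 'single');

% 'double' makes sum accumulate and return in double precision;
% the result is then cast back to the class of the input.
N = cast(sqrt(sum(P.^2, 4, 'double')), class(P));

% For comparison: everything in single. 'native' keeps the input class,
% which for single input is the same as the default behaviour.
Nsingle = sqrt(sum(P.^2, 4, 'native'));

% Largest difference between the two variants.
maxDiff = max(abs(N(:) - Nsingle(:)));
```

Whether the double accumulation costs noticeable time compared with the all-single variant would need to be measured on the real data.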


More Answers (1)

Ernst Jan on 28 Feb 2014

0 votes

My results show that the first is actually faster:
n = 10000;
P1 = rand(1,n);
tic
A1 = sqrt(sum(P1.^2));
toc
tic
A2 = sqrt(P1*P1');
toc
tic
A3 = sqrt(sum(P1.*P1));
toc
P2 = rand([100,100,100,70]);
tic
A4 = sqrt(sum(P2.*P2,4));
toc
tic
A5 = sqrt(sum(P2.^2,4));
toc
Elapsed time is 0.000044 seconds.
Elapsed time is 0.000141 seconds.
Elapsed time is 0.000031 seconds.
Elapsed time is 0.307783 seconds.
Elapsed time is 0.309741 seconds.
Could you please provide a code example?

2 Comments

lvn on 28 Feb 2014
Here are my results:
>> P=rand(100,1);
>> tic; for k=1:1000000, sqrt(P'*P); end; toc;
Elapsed time is 0.918256 seconds.
>> tic; for k=1:1000000, sqrt(sum(P.^2)); end; toc;
Elapsed time is 1.533144 seconds.
But to be clear, the question is related more specifically to the 4D case.
Matt J on 28 Feb 2014
Edited: Matt J on 28 Feb 2014
@Ernst
The value of n you're using is far too small to give a meaningful comparison. Here's what I get with n=1e7:
Elapsed time is 0.031045 seconds.
Elapsed time is 0.008693 seconds.
Elapsed time is 0.030998 seconds.
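Single `tic`/`toc` runs like the ones in this thread can be dominated by noise and warm-up effects. As a sketch, the same three expressions could be timed with the built-in `timeit`, which calls each function handle several times and returns a typical (median) run time; the variable names below are illustrative.

```matlab
n  = 1e7;
P1 = rand(1, n);

% Each call returns a run time in seconds for the given expression.
tSumPow  = timeit(@() sqrt(sum(P1.^2)));
tInner   = timeit(@() sqrt(P1*P1'));
tSumProd = timeit(@() sqrt(sum(P1.*P1)));

fprintf('sqrt(sum(P.^2)):  %.6f s\n', tSumPow);
fprintf('sqrt(P*P''):       %.6f s\n', tInner);
fprintf('sqrt(sum(P.*P)):  %.6f s\n', tSumProd);
```

The absolute numbers will differ between machines and releases, so only the ratios on a given system are meaningful.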


Asked: lvn on 28 Feb 2014

Commented: lvn on 17 Apr 2014
