Does the selfattentionLayer also perform softmax and scaling?

Question

Chih 2023 年 4 月 3 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/1940374-does-the-selfattentionlayer-also-perform-softmax-and-scaling

編集済み: cui,xingxing 2024 年 4 月 27 日

採用された回答: Rohit

In https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.selfattentionlayer.html, it states that:

A self-attention layer computes single-head or multihead self-attention of its input.

The layer:

Computes the queries, keys, and values from the input
Computes the scaled dot-product attention across heads using the queries, keys, and values
Merges the results from the heads
Performs a linear transformation on the merged result

I wonder if the layer also apply softmax to the scaling (i.e. divide (Q*K) by sqrt(dim))? My understanding is that, within step 2, this softmax and scaling should happen.

Please clarify that for me or more general users.

Thanks.

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Rohit 2023 年 4 月 20 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/1940374-does-the-selfattentionlayer-also-perform-softmax-and-scaling#answer_1219478

I understand that you want to know whether ‘selfAttentionLayer’ performs softmax and scaling operations which are involved to compute attention score.

Yes, we perform both operations to compute scaled attention score and then apply softmax as required in attention mechanism.

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

Chih 2023 年 4 月 20 日

Thank you very much, Rohit.

サインインしてコメントする。

Answer 2

cui,xingxing 2024 年 1 月 11 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/1940374-does-the-selfattentionlayer-also-perform-softmax-and-scaling#answer_1387576

編集済み: cui,xingxing 2024 年 4 月 27 日

Hi,@Chih

Please check out the details of the code I wrote here link.

-------------------------Off-topic interlude, 2024-------------------------------

I am currently looking for a job in the field of CV algorithm development, based in Shenzhen, Guangdong, China,or a remote support position. I would be very grateful if anyone is willing to offer me a job or make a recommendation. My preliminary resume can be found at: https://cuixing158.github.io/about/ . Thank you!

Email: cuixingxing150@gmail.com

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

サインインしてコメントする。

Does the selfattentionLayer also perform softmax and scaling?

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

採用された回答

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

その他の回答 (1 件)

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

参考

カテゴリ

タグ

製品

リリース

Community Treasure Hunt

Does the selfattentionLayer also perform softmax and scaling?

0 件のコメント -2 件の古いコメントを表示-2 件の古いコメントを非表示

採用された回答

1 件のコメント -1 件の古いコメントを表示-1 件の古いコメントを非表示

その他の回答 (1 件)

0 件のコメント -2 件の古いコメントを表示-2 件の古いコメントを非表示

参考

カテゴリ

タグ

製品

リリース

Community Treasure Hunt

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

1 件のコメント
-1 件の古いコメントを表示-1 件の古いコメントを非表示

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示