selfAttentionLayer can't process sequence-to-label problem?

Question

cui,xingxing on 5 Jan 2024

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/2066601-selfattentionlayer-can-t-process-sequence-to-label-problem

Edited: cui,xingxing about 18 hours ago

Why can't selfAttentionLayer handle the following simple sequence classification problem? The data has already been passed through a flattenLayer to make it one-dimensional. By contrast, an lstmLayer with "OutputMode" set to "last" trains without error.

% Here use simple data, for demonstration purposes only
XTrain = rand(3,200,1000); % dims "CTB"
TTrain = categorical(randi(4,1000,1));
% define my layers
numClasses = numel(categories(TTrain));
layers = [inputLayer(size(XTrain),"CTB");
    flattenLayer;
    selfAttentionLayer(6,48);
    % lstmLayer(20,OutputMode="last"); % use lstmLayer is ok!
    layerNormalizationLayer;
    fullyConnectedLayer(numClasses);
    softmaxLayer];
net = dlnetwork(layers);
% train network
lossFcn = "crossentropy";
options = trainingOptions("adam", ...
    MaxEpochs=1, ...
    InitialLearnRate=0.01,...
    Shuffle="every-epoch", ...
    GradientThreshold=1, ...
    Verbose=true);
netTrained = trainnet(XTrain,TTrain,net,lossFcn,options);
Error using trainnet
Number of observations in predictors (1000) and targets (1) must match. Check that the data and network are consistent.


Answer 1

cui,xingxing on 7 Jan 2024

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/2066601-selfattentionlayer-can-t-process-sequence-to-label-problem#answer_1384691

Edited: cui,xingxing about 18 hours ago

The output feature map still has a time ("T") dimension, and it has to be removed so that the network output matches the dimensions of the targets. This can usually be done with an indexing1dLayer, so one is added to the layers array before the fullyConnectedLayer.

% Here use simple data, for demonstration purposes only
XTrain = rand(3,200,1000); % dims "CTB"
TTrain = categorical(randi(4,1000,1));
% define my layers
numClasses = numel(categories(TTrain));
layers = [inputLayer(size(XTrain),"CTB");
    flattenLayer;
    selfAttentionLayer(6,48);
    % lstmLayer(20,OutputMode="last"); % use lstmLayer is ok!
    layerNormalizationLayer;
    
    indexing1dLayer; % Add this!!!
    fullyConnectedLayer(numClasses);
    softmaxLayer];
net = dlnetwork(layers);
% train network
lossFcn = "crossentropy";
options = trainingOptions("adam", ...
    MaxEpochs=1, ...
    InitialLearnRate=0.01,...
    Shuffle="every-epoch", ...
    GradientThreshold=1, ...
    Verbose=true);
netTrained = trainnet(XTrain,TTrain,net,lossFcn,options);
    Iteration    Epoch    TimeElapsed    LearnRate    TrainingLoss
    _________    _____    ___________    _________    ____________
            1        1       00:00:02         0.01          1.5374
            7        1       00:00:06         0.01          1.5272
Training stopped: Max epochs completed
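
To see the time dimension described above, the output format of both networks can be inspected directly. The following is a minimal sketch, not part of the original answer: it assumes a release that provides inputLayer and indexing1dLayer, uses a small random batch of 8 observations to keep memory low, and the names body, head, netSeq, and netLabel are made up for illustration.

```matlab
% Small stand-in batch with the same layout as the question: channels x time x batch
X = dlarray(rand(3,200,8),"CTB");
numClasses = 4;

% Layers shared by both variants (taken from the question)
body = [inputLayer([3 200 8],"CTB")
    flattenLayer
    selfAttentionLayer(6,48)
    layerNormalizationLayer];
head = [fullyConnectedLayer(numClasses)
    softmaxLayer];

netSeq   = dlnetwork([body; head]);                   % network from the question
netLabel = dlnetwork([body; indexing1dLayer; head]);  % network with the fix

YSeq   = predict(netSeq,X);
YLabel = predict(netLabel,X);

% Compare the dimension labels and sizes of the two outputs
disp(dims(YSeq));   disp(size(YSeq));
disp(dims(YLabel)); disp(size(YLabel));
```

If the explanation above holds, the first output should still carry a "T" label (one prediction per time step), while the second should only have channel and batch dimensions, which is what trainnet needs for one label per observation. As far as I know, indexing1dLayer keeps the first time step by default; indexing1dLayer("last") would be the closer match to lstmLayer with OutputMode="last", but check the documentation for your release.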


5 comments (3 older comments not shown)

DGM on 5 Mar 2024

Posted as a comment-as-flag by chang gao:

Useful answer.

jingwen on 15 Apr 2024 at 11:45

Your answer helps me! Thank you
