Procedure to link state path and action path in a DQL critic reinforcement learning agent?

Hi all
Some reinforcement learning examples on the MathWorks site use a single-output critic for the DQN agent.
See, for instance, this page:
https://es.mathworks.com/help/reinforcement-learning/ref/rldqnagent.html
section: Create a DQN Agent Using a Single-Output Critic Representation
In this case, the observation path and the action path are linked through an additionLayer.
I wonder why an additionLayer is used instead of a concatenationLayer?
It seems more appropriate to keep the inputs to the common path independent for each of the two parts, StatePath and ActionPath, given that, after the junction, there are more layers in the common path to model the critic output.
When would you recommend using an addition layer to join the state and action paths instead of a concatenation layer?
Thanks in advance

Accepted Answer

Emmanouil Tzorakoleftherakis on 29 Mar 2021
Hello,
Some comments on the points you raise above:
1. There are two ways to create the critic network for DQN, as you probably saw on the doc page: one uses a single output (the Q-value for the provided state-action pair), and the other uses multiple outputs (the Q-values for all possible actions for the specified input state). The latter is more efficient and is typically preferred.
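To make the efficiency point concrete, here is a minimal sketch in plain Python (not MATLAB, and not the toolbox API). The critics are toy stand-in functions rather than trained networks, and every name in it (`toy_q_single`, `toy_q_multi`, `ACTIONS`, `calls`) is hypothetical. It only illustrates that picking the greedy action takes one critic evaluation per action with a single-output critic, but a single evaluation with a multi-output critic.

```python
# Toy comparison of greedy action selection with the two critic styles.
# All names and values here are illustrative assumptions.

ACTIONS = [-10, 0, 10]              # hypothetical discrete action set
calls = {"single": 0, "multi": 0}   # counts critic evaluations


def toy_q_single(state, action):
    """Stand-in single-output critic: Q(state, action) -> scalar."""
    calls["single"] += 1
    return -(action - state) ** 2


def toy_q_multi(state):
    """Stand-in multi-output critic: Q(state) -> one value per action."""
    calls["multi"] += 1
    return [-(a - state) ** 2 for a in ACTIONS]


def greedy_single_output(state):
    # One critic evaluation for every candidate action.
    return max(ACTIONS, key=lambda a: toy_q_single(state, a))


def greedy_multi_output(state):
    # One critic evaluation returns all Q-values at once.
    q_values = toy_q_multi(state)
    best_index = max(range(len(ACTIONS)), key=lambda i: q_values[i])
    return ACTIONS[best_index]


best_single = greedy_single_output(3)
best_multi = greedy_multi_output(3)
```

Both functions select the same action; the difference is only in how many times the critic has to be evaluated.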
2. Concatenation and addition both merge features. Even with concatenation, you are only maintaining independent input paths on the surface, since the fully connected layer that follows the concatenation layer also sums the contributions of the concatenated features. But your intuition is correct in that concatenation is more efficient:
-Addition: y = A1*u1 + b1 + A2*u2 + b2
-Concatenation: y = A*[u1;u2] + b = A1*u1 + A2*u2 + b
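The two equations can be checked numerically. The following is a small sketch in plain Python (not MATLAB toolbox code); the matrices, vectors, and helper names are made up for illustration. It assumes the concatenation weights are chosen as A = [A1 A2] and the bias as b = b1 + b2, in which case both merges give the same output.

```python
# Numeric check that addition and concatenation merging can coincide.
# All values and helper names are illustrative assumptions.

def matvec(A, u):
    """Multiply matrix A (a list of rows) by vector u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def addition_merge(A1, b1, u1, A2, b2, u2):
    """y = A1*u1 + b1 + A2*u2 + b2 (one FC layer per branch, then a sum)."""
    y1 = matvec(A1, u1)
    y2 = matvec(A2, u2)
    return [p + q + c1 + c2 for p, q, c1, c2 in zip(y1, y2, b1, b2)]


def concat_merge(A, b, u1, u2):
    """y = A*[u1;u2] + b (stack the inputs, then apply one FC layer)."""
    return [v + c for v, c in zip(matvec(A, u1 + u2), b)]


# Hypothetical sizes: 3 state features, 1 action feature, 2 hidden units.
A1 = [[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]]
b1 = [0.25, 0.5]
A2 = [[2.0], [-4.0]]
b2 = [0.75, -0.5]
u1 = [1.0, 0.0, -1.0]   # state features
u2 = [0.5]              # action feature

# Assumed correspondence: A = [A1 A2] (side by side), b = b1 + b2.
A = [r1 + r2 for r1, r2 in zip(A1, A2)]
b = [c1 + c2 for c1, c2 in zip(b1, b2)]

y_add = addition_merge(A1, b1, u1, A2, b2, u2)
y_cat = concat_merge(A, b, u1, u2)
```

Because only the sum b1 + b2 ever reaches the output, the second bias vector in the addition form carries no extra modelling capacity.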
A couple of differences between the two:
a) The additionLayer requires a fully connected (FC) layer for each input branch, whereas concatenation needs only one FC layer downstream of the concatenation layer. So overall, concatenation needs fewer FC layers
b) You could choose to have FC layers feeding the concatenation layer, similar to what you need for addition. In that case, the preceding FC layers do not need to have the same number of nodes when you use concatenation (unlike with addition)
c) As you can tell from the above equations, concatenation uses fewer parameters (a single bias b instead of b1 and b2), so it is more efficient in that sense
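As a rough illustration of the parameter count, the sketch below (plain Python, not toolbox code) tallies the weights and biases of the merging layers only. The layer sizes are hypothetical, and it assumes the addition variant uses one FC layer per branch with equal output sizes, while the concatenation variant uses a single FC layer on the stacked input.

```python
# Count learnable parameters of the merge stage under assumed layer sizes.

def fc_params(n_in, n_out):
    """Parameters of a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out


def addition_merge_params(n_state, n_action, n_hidden):
    # One FC layer per branch; both need n_hidden outputs to be summed.
    return fc_params(n_state, n_hidden) + fc_params(n_action, n_hidden)


def concat_merge_params(n_state, n_action, n_hidden):
    # A single FC layer applied to the stacked [state; action] vector.
    return fc_params(n_state + n_action, n_hidden)


# Hypothetical sizes: 4 observations, 1 action, 24 hidden units.
n_state, n_action, n_hidden = 4, 1, 24
add_count = addition_merge_params(n_state, n_action, n_hidden)
cat_count = concat_merge_params(n_state, n_action, n_hidden)
saved = add_count - cat_count
```

With these sizes the difference equals the number of hidden units, that is, exactly one bias vector.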
I think one of the advantages of addition is that it is more transparent/easily understood, which is why it's being used in the shipping examples
3. If you find it difficult to create neural network architectures manually, you can use the default agent feature. This feature creates a default network architecture for you (you can always go back and change the architecture if you want).
Hope this helps

More Answers (0)
