Reinforcement learning actor is empty

Question

rr0101, July 13, 2021

Direct link to this question: https://jp.mathworks.com/matlabcentral/answers/877453-reinforcement-learning-actor-is-empty

Answered: Zuber Khan, May 8, 2024

Hello everyone,

I am using the Reinforcement Learning Toolbox. I created my environment and an rlDQNAgent, but when I try to retrieve the actor, getActor(agent) returns actor = [].

I don't know what the problem is. Also, after training I wanted to evaluate the agent by calling getAction with different observations, but it returns the same action for every observation. This was not the case when I called getAction with the same observations before training.

Any suggestions?

Here is my code:

% Action and observation specifications from the environment
ActionInfo = getActionInfo(env);
ObservationInfo = getObservationInfo(env);

% Q-network: observation in, one Q-value per discrete action out.
% Dimension(2) assumes a 1-by-N observation; use Dimension(1) for N-by-1.
dnn = [
    featureInputLayer(ObservationInfo.Dimension(2),'Normalization','none','Name','state')
    fullyConnectedLayer(24,'Name','CriticStateFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(24,'Name','CriticStateFC2')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(length(ActionInfo.Elements),'Name','output')];

% Critic (Q-value function) built from the network
criticOptions = rlRepresentationOptions('LearnRate',1e-4,'GradientThreshold',1,'L2RegularizationFactor',1e-4);
critic = rlQValueRepresentation(dnn,ObservationInfo,ActionInfo,...
    'Observation',{'state'},criticOptions);

% DQN agent options: smoothing factor 1 with update frequency 4 gives a
% periodic (hard) target-critic update every 4 steps
agentOpts = rlDQNAgentOptions(...
    'UseDoubleDQN',false, ...
    'TargetSmoothFactor',1, ...
    'TargetUpdateFrequency',4, ...
    'ExperienceBufferLength',100000, ...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',256);
% agentOptions = rlDDPGAgentOptions;
agent = rlDQNAgent(critic,agentOpts);

% Train until the average reward reaches 30 or 500 episodes have run
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',500, ...
    'MaxStepsPerEpisode',30, ...
    'Verbose',false, ...
    'Plots','training-progress',...
    'StopTrainingCriteria','AverageReward',...
    'StopTrainingValue',30);
trainingStats = train(agent,env,trainOpts);


Answer 1

Zuber Khan, May 8, 2024

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/877453-reinforcement-learning-actor-is-empty#answer_1454472

Hi,

As I understand it, getActor(agent) returns an empty result because you are using a DQN agent (rlDQNAgent). A DQN agent is a value-based reinforcement learning agent: it trains a critic to estimate the expected discounted cumulative long-term reward when following the optimal policy, and it has no actor.

Value-based agents use only critics to select their actions and rely on an indirect policy representation. They use an approximator to represent a value function (value as a function of the observation) or a Q-value function (value as a function of observation and action). For a DQN agent the critic is a Q-value function, and you can retrieve it with getCritic(agent).
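To illustrate how a value-based agent picks an action without an actor, here is a minimal sketch. The observation and action specifications (a 4-element observation, three discrete actions) and the small network are assumptions standing in for your environment, which was not posted. It uses rlQValueRepresentation to match the code in the question; in R2022a and later, rlVectorQValueFunction is the recommended replacement.

```matlab
% Hypothetical specs standing in for the poster's environment (assumption).
obsInfo = rlNumericSpec([4 1]);        % 4-element column observation
actInfo = rlFiniteSetSpec([-1 0 1]);   % three discrete actions

% Small multi-output Q-network: one output per discrete action.
net = [
    featureInputLayer(4,'Normalization','none','Name','state')
    fullyConnectedLayer(24,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(numel(actInfo.Elements),'Name','output')];

critic = rlQValueRepresentation(net,obsInfo,actInfo,'Observation',{'state'});
agent  = rlDQNAgent(critic);

% A DQN agent has no actor, so retrieve its critic instead.
agentCritic = getCritic(agent);

% Q-values of all actions for one observation, then the greedy action.
obs = rand(4,1);
q = getValue(agentCritic,{obs});
[~,idx] = max(q);
greedyAction = actInfo.Elements(idx);
```

The greedy action is simply the element of the action set with the largest Q-value, which is the "indirect policy" that a DQN agent relies on.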

You can refer to the following documentation to understand more about DQN agents:

https://www.mathworks.com/help/reinforcement-learning/ref/rl.agent.rldqnagent.html

To learn more about creating policies and value functions, see the following documentation:

https://www.mathworks.com/help/reinforcement-learning/ug/create-policy-and-value-functions.html

As for the second issue, getting the same action for all observations suggests that the agent has not trained properly in the given environment and has not learned a policy that distinguishes between states. Possible causes are training that was too short, the complexity of the environment, or a network architecture and hyperparameters that are not well suited to the problem.
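One hyperparameter worth checking is the exploration schedule. The sketch below is plain arithmetic, assuming the documented rlDQNAgentOptions defaults as I understand them (Epsilon = 1, EpsilonMin = 0.01, EpsilonDecay = 0.005, with Epsilon multiplied by 1 - EpsilonDecay after each training step). Please verify these values for your release; they are not taken from your post.

```matlab
% Assumed default epsilon-greedy settings (check your release).
epsilon0     = 1;       % initial exploration probability
epsilonMin   = 0.01;    % floor on the exploration probability
epsilonDecay = 0.005;   % per-step multiplicative decay rate

% Steps until epsilon0*(1-epsilonDecay)^n drops to epsilonMin.
stepsToFloor = ceil(log(epsilonMin/epsilon0) / log(1 - epsilonDecay));

% Upper bound on training steps from the question's training options.
maxTrainingSteps = 500 * 30;   % MaxEpisodes * MaxStepsPerEpisode

% Fraction of the training budget spent above the exploration floor.
fractionExploring = stepsToFloor / maxTrainingSteps;

% To slow the decay, one could set, for example:
% agentOpts.EpsilonGreedyExploration.EpsilonDecay = 1e-3;
```

Under these assumptions the agent reaches its minimum exploration rate after 919 steps, a small part of the 15000-step budget, so a slower decay may be worth trying if the training plot flattens early.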

I suggest looking closely at the training progress plot and metrics to confirm that the agent is learning over time. If the performance plateaus early or does not improve, consider adjusting the network architecture, the agent options, the training options, or other hyperparameters. Also make sure that the observations you pass to getAction differ significantly and cover the state space well. Small differences in observations might not lead to different actions, especially if the Q-values are close.
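A quick way to check whether the Q-values are close is to compare the best and second-best Q-value for each test observation. The sketch below uses a made-up Q-value matrix purely for illustration; with a real agent, each row could be obtained from getValue(getCritic(agent),{obs}) for one observation.

```matlab
% Hypothetical Q-values: one row per observation, one column per action.
Q = [ 1.20  1.25  0.40
      0.90  0.95  0.30
      2.10  2.30  1.00 ];

% Sort each row so the first column holds the largest Q-value.
[qSorted, order] = sort(Q, 2, 'descend');

% Index of the greedy action for every observation.
greedyIdx = order(:,1);

% Margin between the best and second-best action per observation.
margin = qSorted(:,1) - qSorted(:,2);

% True when every observation maps to the same greedy action.
sameActionEverywhere = numel(unique(greedyIdx)) == 1;
```

Here every row selects action 2, but the margins in the first two rows are only 0.05, so a small change in the critic could flip the choice. If one action wins by a large margin for every observation, the critic has probably collapsed to a state-independent preference.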

You can also try a different type of RL agent that might be better suited to your environment.

Since you have not explicitly provided the environment, it is not possible to debug the given code, so I have given a generic response.

I hope this will help you in resolving your issue.

Regards,

Zuber
