How to let the reinforcement learning agent know exactly what action it takes?

Question

Aaron Bramhasta, 5 Nov 2024, 17:06

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/2164170-how-to-let-the-reinforcement-learning-agent-know-exactly-what-action-it-takes

Answered: Maneet Kaur Bagga, about 11 hours ago

Model.zip

Dear MATLAB Experts,

I am currently running a reinforcement learning (RL) simulation integrated with a discrete-event system (DES) in Simulink. My main discrete-event simulation uses a bus element containing multiple entities. Some of these serve as observations for the RL agent (via entity-to-signal conversion), and the action the RL agent chooses is imposed on the DES (via signal-to-entity conversion). I implemented a policy in the DES whereby, when certain requirements are met, an entity value is assigned that switches an entity gate to determine which course of action to take. However, my RL agent does not seem to understand this rule: it assigns the entity value randomly from the available values. Is there a way to make the rule that exists in the DES understandable to the RL agent?

Thank you so much in advance! I am attaching my model for reference.

Best regards,

Aaron.

Answer 1

Maneet Kaur Bagga, about 11 hours ago

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/2164170-how-to-let-the-reinforcement-learning-agent-know-exactly-what-action-it-takes#answer_1547838

Hi,

As I understand it, the issue arises because your DES contains specific policies, such as switching gates based on entity attributes. These rules are likely hard-coded in the DES and are not part of the RL environment's observation or reward structure, so the agent has no way to perceive them. The RL agent selects actions based only on the observations it receives and the policy it has learned so far, which is why its choices can look random with respect to the rule.

Here are some possible workarounds:

Incorporate Rule into Observations: Add flags or variables to the observation that indicate the rule's state (e.g., "Gate should switch" = 1/0). Ensure these flags are updated dynamically during simulation.
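
A minimal sketch of the idea, assuming a made-up rule ("the gate should switch when the queue length exceeds a threshold"). The rule, the variable names, and the values are all hypothetical and are not taken from the attached model:

```matlab
% Hypothetical DES rule: the gate should switch when the queue length
% exceeds a threshold. All names and values here are illustrative.
queueThreshold = 5;
gateShouldSwitch = @(queueLength) double(queueLength > queueThreshold);

% Raw observation, e.g. from the entity-to-signal conversion (made-up values).
queueLength = 7;
serviceTime = 2.5;
rawObs = [queueLength; serviceTime];

% Augmented observation: the rule flag is appended as an extra element.
obs = [rawObs; gateShouldSwitch(queueLength)];
disp(obs.')
```

If you go this way, the agent's observation specification would have to grow by one element to match the augmented vector.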

Augment Reward Structure: Add a reward for actions that align with the DES rules, or a penalty for actions that violate them. This encourages the RL agent to learn behaviors aligned with the rules.

reward = reward + (agentAction == expectedAction) * rewardFactor;
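
The line above only adds a bonus. A variant that also penalizes violations could look like the sketch below. The factor values and the name "penaltyFactor" are assumptions chosen for illustration:

```matlab
% Illustrative shaping terms; both values are assumptions to be tuned.
rewardFactor  = 1.0;   % bonus when the agent follows the DES rule
penaltyFactor = 2.0;   % penalty when the agent violates the DES rule

% Shaped reward: base reward plus bonus or minus penalty.
shapedReward = @(baseReward, agentAction, expectedAction) ...
    baseReward ...
    + (agentAction == expectedAction) * rewardFactor ...
    - (agentAction ~= expectedAction) * penaltyFactor;

rewardWhenFollowing = shapedReward(0.3, 1, 1);   % rule followed
rewardWhenViolating = shapedReward(0.3, 2, 1);   % rule violated
```

A shaping term that is large relative to the base reward can dominate what the agent learns, so the two factors usually need some tuning.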

Pretrain the Agent: Use supervised learning to pretrain the RL agent to follow the DES rules as a baseline policy. Later, fine-tune with reinforcement learning.
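
For the pretraining step you first need labeled data. One possible way, sketched here under the assumption that the DES rule can be written as a simple function of the observation, is to sample observations and label them with the rule's action. The rule "desRule", the two observation channels, and the sample ranges are hypothetical:

```matlab
rng(0);                      % reproducible sampling
numSamples = 200;

% Hypothetical DES rule: choose action 2 above the threshold, else action 1.
queueThreshold = 5;
desRule = @(queueLengths) 1 + (queueLengths > queueThreshold);

% Sample made-up observations: integer queue length and a service time.
queueLengths = randi([0 10], numSamples, 1);
serviceTimes = 5 * rand(numSamples, 1);

% Supervised dataset: each row of observations has one expert action label.
observations  = [queueLengths, serviceTimes];
expertActions = desRule(queueLengths);
```

Fitting the actor network to "observations" and "expertActions" is left out of this sketch, since it depends on the agent type and network you use.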

Custom Environment Dynamics: Modify the environment (DES model) such that the DES rules are enforced during interaction. For instance, override the agent’s selected action if it violates a rule.

if violatesRule(action, currentState)
    action = enforceRule(currentState);
end
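
"violatesRule" and "enforceRule" in the snippet above are placeholders. A self-contained sketch with hypothetical definitions for both (the threshold rule and the state field "queueLength" are assumptions) might be:

```matlab
% Hypothetical rule: action 2 above the queue threshold, action 1 otherwise.
queueThreshold = 5;
enforceRule  = @(state) 1 + (state.queueLength > queueThreshold);
violatesRule = @(action, state) action ~= enforceRule(state);

% Example: the agent proposes action 1 while the rule requires action 2.
currentState = struct('queueLength', 8);
action = 1;

% Override the proposed action when it breaks the rule.
if violatesRule(action, currentState)
    action = enforceRule(currentState);
end
disp(action)
```

Keep in mind that with an override like this the agent may still be credited for the action it originally proposed, so it is worth combining the override with a penalty so that the agent also learns the rule.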

Regularization: Include constraints in the training process that mimic the DES rules. For example, ensure that the policy network outputs actions adhering to the rules.

loss = loss + ruleViolationPenalty * countViolations(actions, state);
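
"countViolations" is also a placeholder. One way it could be defined for a batch, again assuming the hypothetical threshold rule and made-up batch values, is:

```matlab
% Hypothetical rule: action 2 above the queue threshold, action 1 otherwise.
queueThreshold = 5;
desRule = @(queueLengths) 1 + (queueLengths > queueThreshold);

% Number of batch entries where the chosen action differs from the rule.
countViolations = @(actions, queueLengths) ...
    sum(actions ~= desRule(queueLengths));

% Made-up batch: entries 3 and 4 violate the rule.
actions      = [1; 2; 2; 1];
queueLengths = [3; 8; 2; 9];

loss = 0.40;                 % placeholder for the original training loss
ruleViolationPenalty = 0.1;  % assumed weight
loss = loss + ruleViolationPenalty * countViolations(actions, queueLengths);
```

A plain count like this is not differentiable, so in gradient-based training you would likely need a smooth surrogate, for example a penalty on the probability the policy assigns to rule-violating actions.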

Rule-Based Hybrid Approach: Use the agent's "getAction" function to test the agent's action in specific scenarios and compare it against the DES policy to identify mismatches.

Please refer to the following MathWorks documentation for "rlACAgent", which covers "getAction", for a better understanding:

https://in.mathworks.com/help/reinforcement-learning/ref/rl.agent.rlacagent.html
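
The comparison itself can be sketched as below. Here "agentPolicy" is only a stand-in function handle for the trained agent (in practice you would query the agent through "getAction"), and both it and "desRule" are hypothetical:

```matlab
% Hypothetical DES rule and a stand-in for the trained agent's policy.
queueThreshold = 5;
desRule     = @(queueLengths) 1 + (queueLengths > queueThreshold);
agentPolicy = @(queueLengths) 1 + (queueLengths > 7);   % stand-in only

% Evaluate both on a grid of test scenarios.
testQueueLengths = (0:10).';
agentActions = agentPolicy(testQueueLengths);
desActions   = desRule(testQueueLengths);

% Collect the scenarios where the agent disagrees with the DES rule.
mismatchIdx = find(agentActions ~= desActions);
mismatchTable = [testQueueLengths(mismatchIdx), ...
                 agentActions(mismatchIdx), ...
                 desActions(mismatchIdx)];

% Columns: queue length, agent action, DES action.
disp(mismatchTable)
```

The scenarios listed in "mismatchTable" are the ones where extra reward shaping or more training data is most likely to help.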

Hope this helps!
