Live Monitoring of Critic Predictions in the RL Toolbox
Is it possible to monitor the Q-value predictions of any critic-based RL approach using the RL Toolbox? For example, with a multi-output DQN agent the internal deep neural network has to be evaluated at every step to score all possible discrete actions given the current state sample. Hence, somewhere internally there must be a Q-value prediction for every available discrete action, and these predictions are then compared to find the optimal action.
However, after spending some time with the R2020a documentation, I was not able to find a way to access these internal Q-value predictions at each time step. In particular, it would be nice if the Simulink-based RL Agent block could expose these predictions for further processing and monitoring during both training and deployment.
Does anybody have a useful hint on how to retrieve the Q-value estimates during learning?
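What I can do so far is probe the critic offline using the toolbox's getCritic and getValue functions; a minimal sketch (assuming agent is an initialized rlDQNAgent and obs is a single observation sample):

critic = getCritic(agent);          % extract the critic representation from the agent
qValues = getValue(critic, {obs});  % vector of Q-values, one entry per discrete action
[qMax, actIdx] = max(qValues);      % greedy action index and its predicted value

This works for inspecting the critic between episodes, but I haven't found a way to get the same Q-value vector streamed out of the RL Agent block at every simulation step.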
Answers (0)