Changing how DQN agent explores
2 views (last 30 days)
Hi,
I'm using a DQN agent with epsilon-greedy exploration. The problem is that my agent sees state 1 99% of the time, so it never learns to act in other states. By the time it learns to get to state 2 from state 1, epsilon has already decayed significantly and the agent gets stuck taking a sub-optimal action in state 2. Is there a way to implement some other form of exploration, like using a Boltzmann distribution? Thanks for your time.
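For reference, Boltzmann (softmax) exploration samples actions in proportion to the exponentiated Q-values, so non-greedy actions keep a nonzero probability even late in training. Below is a minimal, hedged sketch in NumPy (the function name and `temperature` parameter are illustrative, not part of any toolbox API):

```python
import numpy as np

def boltzmann_action(q_values, temperature=1.0):
    """Sample an action index from a softmax (Boltzmann) distribution over Q-values.

    Higher temperature -> closer to uniform random; lower temperature -> closer to greedy.
    """
    prefs = np.asarray(q_values, dtype=float) / temperature
    # Subtract the max for numerical stability before exponentiating.
    prefs -= prefs.max()
    probs = np.exp(prefs)
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)
```

Annealing the temperature toward a small value over training recovers near-greedy behavior, analogous to epsilon decay, but the agent still weights actions by their estimated value rather than picking uniformly when it explores.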
2 Comments
Tanay Gupta
13 Jul 2021
Can you give a brief description of the states and the respective transitions?
Answers (0)