SARSA Reinforcement Learning

Version 1.0.0.0 (117 KB) by Bhartendu

Maze solving using SARSA, Reinforcement Learning

Rating: 5.0 (5 ratings)

Downloads: 1.7K

Updated 2017/5/24

Refer to Section 6.4 (Sarsa: On-Policy TD Control) of Reinforcement Learning: An Introduction, R. S. Sutton and A. G. Barto, MIT Press.
In this demo, two different mazes are solved with the reinforcement learning technique SARSA.
State-Action-Reward-State-Action (SARSA) is an algorithm for learning a Markov decision process policy, used in reinforcement learning.
SARSA update of the action-value function:

Q(S_t, A_t) := Q(S_t, A_t) + α * [ R_{t+1} + γ * Q(S_{t+1}, A_{t+1}) − Q(S_t, A_t) ]
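
The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code shipped with this submission: the function name `sarsa_update`, the dictionary-of-lists Q-table, and the sample numbers are all assumptions made for the example.

```python
def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Apply one SARSA update to q in place and return the TD error.

    q is a hypothetical Q-table: q[state] is a list of action values.
    """
    # Target uses the action actually chosen in the next state (on-policy).
    td_target = reward + gamma * q[next_state][next_action]
    td_error = td_target - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


# Illustrative two-state, two-action table (values are made up).
q = {0: [0.0, 0.0], 1: [1.0, 2.0]}
td_error = sarsa_update(q, state=0, action=1, reward=-1.0,
                        next_state=1, next_action=1, alpha=0.5, gamma=0.9)
print(td_error, q[0][1])
```

Here the target is -1 + 0.9 * 2 = 0.8, so with α = 0.5 the entry Q(0, 1) moves halfway from 0 toward 0.8.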

Learning rate (α)
The learning rate determines to what extent the newly acquired information will override the old information. A factor of 0 will make the agent not learn anything, while a factor of 1 would make the agent consider only the most recent information.
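
The two extremes described above can be checked numerically. The sketch below is only an illustration; the helper name `blend` and the sample values are assumptions, and the target stands in for R_{t+1} + γ * Q(S_{t+1}, A_{t+1}).

```python
def blend(old_estimate, target, alpha):
    """Move old_estimate toward target by a fraction alpha."""
    return old_estimate + alpha * (target - old_estimate)


old_q, target = 2.0, 10.0
unchanged = blend(old_q, target, alpha=0.0)   # agent learns nothing
replaced = blend(old_q, target, alpha=1.0)    # only the newest information counts
partial = blend(old_q, target, alpha=0.25)    # a quarter of the way to the target
print(unchanged, replaced, partial)
```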

Discount factor (γ)
The discount factor determines the importance of future rewards. A factor of 0 will make the agent "opportunistic" by only considering current rewards, while a factor approaching 1 will make it strive for a long-term high reward. If the discount factor meets or exceeds 1, the Q values may diverge.
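
The effect of γ can be seen by computing a discounted return for a fixed reward sequence. This is a small sketch under assumed values; the function name `discounted_return` and the reward sequences are hypothetical and not taken from the demo.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for a finite reward list."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


delayed = [0.0, 0.0, 0.0, 10.0]                   # reward arrives only at the end
myopic = discounted_return(delayed, 0.0)          # ignores the delayed reward
farsighted = discounted_return(delayed, 0.9)      # 10 * 0.9**3

steady = [1.0] * 1000                             # reward of 1 at every step
bounded = discounted_return(steady, 0.9)          # stays below 1 / (1 - 0.9)
unbounded = discounted_return(steady, 1.0)        # grows with the horizon
print(myopic, farsighted, bounded, unbounded)
```

With γ = 1 the sum simply equals the number of steps, which is why value estimates can grow without bound on tasks that never terminate.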

Note: Convergence has been tested only on particular examples; in general, convergence is not guaranteed for this demo.
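
One way to check convergence empirically is to run SARSA on a small maze and watch the number of steps per episode. The following Python sketch is not the MATLAB code from this submission. The maze layout, the reward of -1 per step, the ε-greedy action selection, the step cap, and every name in it are assumptions chosen for illustration.

```python
import random

MAZE = ["S.#.",
        ".##.",
        "...G"]          # hypothetical layout: S start, G goal, # wall
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(symbol):
    for r, row in enumerate(MAZE):
        for c, cell in enumerate(row):
            if cell == symbol:
                return (r, c)
    raise ValueError(symbol)


START, GOAL = find("S"), find("G")


def step(state, action):
    """Move one cell; bumping into a wall or the border leaves the state unchanged."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == "#":
        r, c = state
    next_state = (r, c)
    return next_state, -1.0, next_state == GOAL


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection; ties go to the first maximal action."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    return values.index(max(values))


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(MAZE)) for c in range(len(MAZE[0]))}
    steps_per_episode = []
    for _ in range(episodes):
        state = START
        action = choose_action(q, state, epsilon, rng)
        for t in range(1, max_steps + 1):
            next_state, reward, done = step(state, action)
            next_action = choose_action(q, next_state, epsilon, rng)
            # The terminal state contributes no future value.
            target = reward if done else reward + gamma * q[next_state][next_action]
            q[state][action] += alpha * (target - q[state][action])
            state, action = next_state, next_action
            if done:
                break
        steps_per_episode.append(t)
    return q, steps_per_episode


q, steps_per_episode = train()
print("first 5 episodes:", steps_per_episode[:5])
print("last 5 episodes:", steps_per_episode[-5:])
```

If learning is working, the later episodes should need far fewer steps than the early ones; the shortest possible path in this assumed maze is 5 moves. A step count that stays near the cap suggests the chosen α, γ, or ε does not suit the maze.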

Cite As

Bhartendu (2026). SARSA Reinforcement Learning (https://jp.mathworks.com/matlabcentral/fileexchange/63089-sarsa-reinforcement-learning), MATLAB Central File Exchange. Retrieved 2026/1/24.

MATLAB Release Compatibility

Created with R2016a

Compatible with any release

Platform Compatibility

Windows macOS Linux

Version	Published	Release Notes
1.0.0.0	2017/5/24	
