Policy Deployment

Code generation and deployment of trained policies

Once you train a reinforcement learning agent, you can generate code to deploy the optimal policy. For example, with MATLAB® Coder™ and GPU Coder™ you can generate C++ or CUDA® code and deploy neural network policies on embedded platforms.

For an overview of how to generate code from a policy, see Generate Code from Trained Reinforcement Learning Policies. For an overview of how to train a policy that is already deployed, see Examine Approaches to Fine Tune a Deployed Policy.

Functions

`generatePolicyFunction`	Generate MATLAB function that evaluates policy of an agent or policy object
`generatePolicyBlock`	Generate Simulink block that evaluates policy of an agent or policy object (Since R2022b)
`policyParameters`	Obtain structure of policy parameters to update policy during simulation or deployment (Since R2025a)
`updatePolicyParameters`	Update policy according to structure of policy parameters given as input argument (Since R2025a)
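
As a rough illustration of how `generatePolicyFunction` might be used ahead of code generation, the following is a minimal sketch under stated assumptions, not a definitive workflow. It assumes a trained agent already exists in the workspace under the hypothetical variable name `agent`, that its observation is a 4-by-1 vector (an assumption chosen only for illustration), and that MATLAB Coder is installed for the final step. With default options, `generatePolicyFunction` writes `evaluatePolicy.m` and `agentData.mat` to the current folder.

```matlab
% Assumption: "agent" is a trained reinforcement learning agent that is
% already in the workspace (hypothetical variable name).
generatePolicyFunction(agent);   % creates evaluatePolicy.m and agentData.mat

% Assumption: the policy expects a 4-by-1 observation vector.
observation = rand(4,1);
action = evaluatePolicy(observation);   % evaluate the policy in MATLAB

% Optional: generate a C++ static library that uses no third-party
% deep learning library (requires MATLAB Coder).
cfg = coder.config("lib");
cfg.TargetLang = "C++";
cfg.DeepLearningConfig = coder.DeepLearningConfig("none");
codegen -config cfg evaluatePolicy -args {ones(4,1)} -report
```

The size passed to `-args` must match the observation dimensions of the actual agent; adjust it accordingly.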

Reinforcement learning policy (Since R2022b)

Reinforcement Learning Workflow
General workflow for applying reinforcement learning to a problem.
Generate Code from Trained Reinforcement Learning Policies
You can generate code for reinforcement learning agents using, for example, GPU Coder or MATLAB Coder.
Examine Approaches to Fine Tune a Deployed Policy
Select the best approach to train a policy in the real world.
Generate Policy Block for Deployment
Generate a policy block to deploy a trained policy.
Train Policy Deployed on Raspberry Pi
Use trainFromData in a MATLAB learning loop to train a policy deployed on a Raspberry Pi board.

Train a policy deployed on a Raspberry Pi® to control a Quanser QUBE™-Servo 2 inverted pendulum.

Verify a reinforcement learning agent in software-in-the-loop and processor-in-the-loop modes.