Answered
I want to print out multiple actions in reinforcement learning
Hi, if you want to create an agent that outputs multiple actions, you need to make sure the actor network is set up accordingly...

More than 2 years ago | 0
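
As a rough illustration of that setup (the dimensions and layer sizes here are hypothetical, not from the thread), the action specification and the actor's final layer must agree on the number of actions:

    % Minimal sketch (R2022a+ syntax): a continuous action space with 2 actions
    obsInfo = rlNumericSpec([4 1]);
    actInfo = rlNumericSpec([2 1], LowerLimit=-1, UpperLimit=1);

    % The last fully connected layer must output one value per action
    net = dlnetwork([
        featureInputLayer(4)
        fullyConnectedLayer(64)
        reluLayer
        fullyConnectedLayer(2)]);   % 2 neurons = 2 actions
    actor = rlContinuousDeterministicActor(net, obsInfo, actInfo);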

Answered
Issue with Q0 Convergence during Training using PPO Agent
It seems you set the training to stop when the episode reward reaches the value of 0.985*(Tf/Ts)*3. I cannot comment on the valu...

More than 2 years ago | 2

| Accepted
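
For reference, a stop condition like the one described is normally configured through rlTrainingOptions; Tf and Ts below are the final and sample times from the question's model:

    % Sketch: stop training once the episode reward reaches the threshold
    trainOpts = rlTrainingOptions( ...
        StopTrainingCriteria="EpisodeReward", ...
        StopTrainingValue=0.985*(Tf/Ts)*3);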

Answered
Where is the actual storage location of the RL agent's weights?
Hello, you can implement the trained policy with automatic code generation, e.g. with MATLAB Coder, Simulink Coder and so on. Y...

More than 2 years ago | 0
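
A minimal sketch of both points, assuming a trained agent in the variable agent and a hypothetical observation size obsDim:

    % Inspect the weights directly, if that is what you are after
    params = getLearnableParameters(getActor(agent));

    % Or extract a deployable policy and generate code from it
    generatePolicyFunction(agent);   % writes evaluatePolicy.m and agentData.mat
    codegen('evaluatePolicy', '-args', {ones(obsDim,1)});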

Answered
How do I find the objective/cost function for the example Valet parking using multistage NLMPC. (https://www.mathworks.com/help/mpc/ug/parking-valet-using-nonlinear-model-pred
Hi, the example you mentioned used MPC on two occasions: 1) On the outer loop for planning through the Vehicle Path Planner blo...

More than 2 years ago | 0
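
As a generic illustration (not the example's actual code): in a multistage NLMPC object, the objective is spread across per-stage cost functions, so that is where to look. The horizon, sizes, and function name below are hypothetical:

    p = 10;                              % hypothetical prediction horizon
    msobj = nlmpcMultistage(p, 4, 2);    % 4 states, 2 manipulated variables
    for k = 1:p+1
        msobj.Stages(k).CostFcn = "myStageCost";   % myStageCost.m is hypothetical
    end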

Answered
Replace RL type (PPO with DDPG) in a MATLAB example
PPO is a stochastic agent whereas DDPG is deterministic. This means that you cannot just use actors and critics designed for PPO...

More than 2 years ago | 1

| Accepted
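
To make the structural difference concrete (all dimensions here are made up): a DDPG actor maps observations directly to actions, while a PPO actor must expose separate mean and standard-deviation outputs:

    obsInfo = rlNumericSpec([3 1]);
    actInfo = rlNumericSpec([1 1]);

    % Deterministic actor (DDPG)
    netD = dlnetwork([featureInputLayer(3); fullyConnectedLayer(16); reluLayer; fullyConnectedLayer(1)]);
    actorDDPG = rlContinuousDeterministicActor(netD, obsInfo, actInfo);

    % Stochastic Gaussian actor (PPO): separate mean and std heads
    lg = layerGraph([featureInputLayer(3, Name="obs"); fullyConnectedLayer(16, Name="fc"); reluLayer(Name="relu")]);
    lg = addLayers(lg, fullyConnectedLayer(1, Name="mean"));
    lg = addLayers(lg, [fullyConnectedLayer(1, Name="stdFc"); softplusLayer(Name="std")]);
    lg = connectLayers(lg, "relu", "mean");
    lg = connectLayers(lg, "relu", "stdFc");
    actorPPO = rlContinuousGaussianActor(dlnetwork(lg), obsInfo, actInfo, ...
        ActionMeanOutputNames="mean", ActionStandardDeviationOutputNames="std");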

Answered
NMPC Controller not buildable for Raspberry Pi
Hard to tell without more details, but I have a suspicion that you are defining the state and cost functions as anonym...

More than 2 years ago | 0
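
For code generation, the model functions should be specified as file names rather than anonymous handles; a sketch with hypothetical dimensions and function names:

    nlobj = nlmpc(4, 2, 1);
    % nlobj.Model.StateFcn = @(x,u) myDynamics(x,u);   % anonymous handle: not buildable
    nlobj.Model.StateFcn = "myStateFcn";               % myStateFcn.m on the MATLAB path
    nlobj.Optimization.CustomCostFcn = "myCostFcn";    % likewise a file, not a handle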

Answered
Regarding Default Terms in DNN
Which algorithm are you using? You can log loss data by following the guidelines here.

More than 2 years ago | 1

Answered
How to start, pause, log information, and continue a Simscape simulation?
If you go for #2, why don't you set it so that you have episodes that are 10 seconds long? When each episode ends, change the i...

More than 2 years ago | 0
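
A hedged sketch of option #2, assuming a Simulink model mdl, an agent block path agentBlk, and a base-workspace variable lastState saved at the end of each episode (all names hypothetical):

    env = rlSimulinkEnv(mdl, agentBlk, obsInfo, actInfo);
    % Start each 10 s episode from where the previous one ended
    env.ResetFcn = @(in) setVariable(in, "x0", evalin("base", "lastState"));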

Answered
How to get the cost function result from a model predictive controller?
Please take a look at the doc page of mpcmove. The Info output contains a field called Cost. You can use it to visualize how th...

More than 2 years ago | 0

| Accepted
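
A minimal sketch of that pattern, assuming an existing controller mpcobj with measured output ym and reference r:

    xc = mpcstate(mpcobj);                    % controller state
    [mv, info] = mpcmove(mpcobj, xc, ym, r);
    J = info.Cost;                            % optimal cost at this control interval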

Answered
The solution obtained with the nlmpcmove function of the MPC toolbox is not "reproducible"?
Hi, for problem 1: I am not sure what's inside that state function, but presumably there is some integrator that gives you k+1....

More than 2 years ago | 0

Answered
How to keep action values at minimum before disturbance and let the agent choose different action values only after the disturbance?
Please take a look here. As of R2022a you can place the RL policy block inside a triggered subsystem and only enable the subsyst...

More than 2 years ago | 0

Answered
How to set multiple stopping or saving criteria for RL agent?
This is currently not possible, but keep an eye on future releases - the development team has been working on this functional...

More than 2 years ago | 0

| Accepted

Answered
How to run the Simulink model when implementing custom RL training?
The way to do it would be to use runEpisode.

More than 2 years ago | 0

| Accepted
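
For context, a skeletal custom loop around runEpisode might look like this (env, policy, numEpisodes, and the policy update itself are assumed to exist already):

    setup(env);                                    % prepare env for repeated episodes
    for ep = 1:numEpisodes
        out = runEpisode(env, policy, MaxSteps=500);
        exps = out.AgentData.Experiences;          % use these to update the policy
    end
    cleanup(env);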

Answered
How to implement custom training with a DQN agent in a Simulink environment?
I would recommend looking at the doc first to see how custom loops/agents are structured. The following links should be helpful:...

More than 2 years ago | 0

| Accepted

Answered
Time-varying policy function
Why don't you just train 3 separate policies and pick and choose as needed?

More than 2 years ago | 0
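
For instance, a simple run-time switch over three trained policies (all names hypothetical, with obs formatted per the observation spec):

    if t < t1
        act = getAction(policy1, {obs});
    elseif t < t2
        act = getAction(policy2, {obs});
    else
        act = getAction(policy3, {obs});
    end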

Answered
Reinforcement Learning: Sudden very high rewards during training of RL model
You should first check the 'error' signal that you feed into the reward for those episodes. It could be that the error becomes too bi...

More than 2 years ago | 0

| Accepted

Answered
DDPG has two different policies
The comparison plot is not set up correctly. The noisy policy also has a noise state, which needs to be propagated after each cal...

More than 2 years ago | 0
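
In code, the fix is to keep the updated policy that getAction returns, so the noise state carries over between calls; a sketch assuming a noisy policy built from the trained actor:

    policy = rlAdditiveNoisePolicy(actor);
    for k = 1:N
        [act, policy] = getAction(policy, {obs});  % second output carries the propagated noise state
        % ... apply act, read the next obs ...
    end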

Answered
Training is getting stuck halfway.
Hi, the error message seems to be longer than what you pasted. It appears there is an indexing error in the step method. Did no...

More than 2 years ago | 0

Answered
How to pass external time-varying parameters to nonlinear MPC models?
Hello, there are two ways of doing this: 1) With Nonlinear MPC, you can set your time-varying parameters as measured disturban...

More than 2 years ago | 1

| Accepted
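
A sketch of the second way, with hypothetical dimensions and names (the state function then takes the parameter as an extra argument):

    nlobj = nlmpc(4, 2, 1);
    nlobj.Model.NumberOfParameters = 1;
    nlobj.Model.StateFcn = "myStateFcn";     % signature myStateFcn(x, u, p)

    opt = nlmpcmoveopt;
    opt.Parameters = {pNow};                 % refresh this before every call
    [mv, opt] = nlmpcmove(nlobj, x, lastMV, ref, [], opt);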

Answered
Why does the MATLAB crash manager come up and MATLAB have to close when I set UseFastRestart = "on" and start training my reinforcement learning agent?
Not easy to answer without the crash log. Can you please contact technical support?

More than 2 years ago | 0

Answered
MPC robotic arm with stepper motor control
The prediction model you provided has direct feedthrough, which is not currently supported by Model Predictive Control Toolbox. W...

More than 2 years ago | 0

Answered
How to include a model (created by me in Simulink) in a MATLAB script?
Hi, currently you cannot use a Simulink model as the prediction model for MPC design. This is something we are working towards for ...

More than 2 years ago | 0

Answered
Setting initial conditions in MPC
To get the behavior you mentioned, the initial states of your plant and controller must be the same. If the initial conditions f...

More than 2 years ago | 0
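
Concretely, with the linear MPC API this alignment is done through the mpcstate object (x0 is the plant's known initial state):

    xc = mpcstate(mpcobj);             % controller state estimate
    xc.Plant = x0;                     % align it with the plant's true initial state
    mv = mpcmove(mpcobj, xc, ym, r);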

Answered
Model predictive controller (time domain)?
Why don't you just use a larger sample time, as you say? You can set it as long as you need it to be, in seconds.

More than 2 years ago | 0

| Accepted

Answered
Reinforcement learning/Experience buffer/Simulink
Why do you want to create your own buffer? If you are using the built-in DDPG agent, the buffer is created automatically for you...

More than 2 years ago | 0
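
For instance, the built-in buffer is sized through the agent options; a sketch assuming obsInfo and actInfo already exist:

    agent = rlDDPGAgent(obsInfo, actInfo);             % default actor/critic networks
    agent.AgentOptions.ExperienceBufferLength = 1e6;   % built-in buffer, no manual setup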

Answered
Non-linear Model Predictive Control Toolbox: manipulated variable remains constant
Well, maybe that's the best the controller can do. I suggest removing the constraint on the manipulated variable temporarily and ...

More than 2 years ago | 0

| Accepted
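
That is, something along these lines on the nlmpc object (the MV index is hypothetical):

    % Temporarily relax the MV bounds to see whether they are the limiter
    nlobj.MV(1).Min = -Inf;
    nlobj.MV(1).Max =  Inf;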

Answered
Using NLMPC on vehicle dynamics
The error seems to be in your bus definition. You don't provide that, so take a closer look and see if you set things up properly. A...

More than 2 years ago | 0

| Accepted

Answered
How to improve a model predictive controller in order to get a lower cost function for the system?
You basically want to get a more aggressive response, if I understand correctly, meaning that your outputs will converge faster t...

More than 2 years ago | 0

| Accepted
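
In practice that usually comes down to retuning the weights; the values below are purely illustrative:

    mpcobj.Weights.OutputVariables = 10;              % track the reference harder
    mpcobj.Weights.ManipulatedVariablesRate = 0.01;   % permit larger control moves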

Answered
About RL Custom Agent / LQRCustomAgent example
Actually, exp is being indexed in exactly the same way; it's just that in the first example we are doing it in one line and in the second ...

More than 2 years ago | 1

| Accepted

Load more