
Log Training Data To Disk

This example shows how to log data to disk when training agents using the train function in Reinforcement Learning Toolbox™.

Overview

The general steps for data logging are:

  1. Create a data logger object using the rlDataLogger function.

  2. Configure the data logger object with callback functions to specify the data to log at different stages of the training process.

  3. Pass the logger object to the train function using the Logger name-value argument.
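In condensed form, the workflow looks like the following sketch. The agent, environment, training options, and the logTrainingLoss callback referenced here are all created later in this example.

% Condensed sketch of the three steps. The agent, env, trainOpts, and
% logTrainingLoss variables referenced here are defined later in this example.
logger = rlDataLogger();                            % 1. Create the logger
logger.AgentLearnFinishedFcn = @logTrainingLoss;    % 2. Configure a logging callback
result = train(agent,env,trainOpts,Logger=logger);  % 3. Pass the logger to train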

Create Data Logger

Create a data logger object using the rlDataLogger function.

fileLogger = rlDataLogger()
fileLogger = 
  FileLogger with properties:

           LoggingOptions: [1×1 rl.logging.option.MATFileLoggingOptions]
       EpisodeFinishedFcn: []
     AgentStepFinishedFcn: []
    AgentLearnFinishedFcn: []

Optionally, specify logging options such as the logging directory and the frequency (in number of episodes) at which the data logger writes data to disk.

% Specify a logging directory. You must have write access for this
% directory.
logDir = fullfile(pwd,"myDataLog");
fileLogger.LoggingOptions.LoggingDirectory = logDir;

% Specify a naming rule for files. The naming rule episode<id> saves files
% as episode001.mat, episode002.mat and so on.
fileLogger.LoggingOptions.FileNameRule = "episode<id>";

% Set the frequency (in number of episodes) at which the data logger writes data to disk
fileLogger.LoggingOptions.DataWriteFrequency = 1;
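Optionally, you can also clear MAT files left over from a previous run so that the analysis later in this example reads only the files generated by this training session. The following is a minimal sketch using standard MATLAB file functions.

% Optional: remove MAT files from a previous run so that later analysis
% reads only files generated by this training session.
if isfolder(logDir) && ~isempty(dir(fullfile(logDir,"*.mat")))
    delete(fullfile(logDir,"*.mat"));
end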

Configure Data Logging

Training data of interest is generated at different stages of training; for example, experience data becomes available after the completion of an episode. Configure the logger object with callback functions that log data at these stages. The callback functions are:

  • EpisodeFinishedFcn - callback function to log data such as experiences, logged Simulink signals, or initial observations. The function is executed after the completion of a training episode. A template for the function is shown below; a concrete sketch follows this list.

function dataToLog = myEpisodeLoggingFcn(data)
% data is a structure that contains the following fields:
% EpisodeCount: The current episode number.
% Environment: Environment object.
% Agent: Agent object.
% Experience: A structure containing the experiences from the current episode.
% EpisodeInfo: A structure containing the fields CumulativeReward, StepsTaken, and InitialObservation.
% SimulationInfo: A Simulink.SimulationOutput object containing logged signals in Simulink environments.
%
% dataToLog is a structure containing the data to be logged to disk.

% Write your code to log data to disk. For example, 
% dataToLog.Experience = data.Experience;

end
  • AgentStepFinishedFcn - callback function to log data such as the state of exploration. The function is executed after the completion of an agent step within an episode. A template for the function is shown below.

function dataToLog = myAgentStepLoggingFcn(data)
% data is a structure that contains the following fields:
% EpisodeCount: The current episode number.
% AgentStepCount: The cumulative number of steps taken by the agent.
% SimulationTime: The current simulation time in the environment.
% Agent: Agent object.
%
% dataToLog is a structure containing the data to be logged to disk.

% Write your code to log data to disk. For example, 
% noiseState = getState(getExplorationPolicy(data.Agent));
% dataToLog.noiseState = noiseState;

end
  • AgentLearnFinishedFcn - callback function to log data such as the actor and critic training losses. The function is executed after the completion of the learn subroutine. A template for the function is shown below.

function dataToLog = myAgentLearnLoggingFcn(data)
% data is a structure that contains the following fields:
% EpisodeCount: The current episode number.
% AgentStepCount: The cumulative number of steps taken by the agent.
% AgentLearnCount: The cumulative number of learning steps taken by the agent.
% EnvModelTrainingInfo: A structure containing the fields TransitionFcnLoss, RewardFcnLoss, and IsDoneFcnLoss. This field applies to model-based agent training.
% Agent: Agent object.
% ActorLoss: Training loss of the actor function.
% CriticLoss: Training loss of the critic function.
%
% dataToLog is a structure containing the data to be logged to disk.

% Write your code to log data to disk. For example, 
% dataToLog.ActorLoss = data.ActorLoss;

end
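As a concrete illustration, the following sketch is a hypothetical EpisodeFinishedFcn that logs only the cumulative reward and the number of steps taken in each episode. Both values come from the EpisodeInfo structure described in the template above; the function name is arbitrary.

function dataToLog = myEpisodeRewardLoggingFcn(data)
% Hypothetical sketch: log the cumulative episode reward and the number of
% steps taken, both available in the EpisodeInfo field of the data structure.
dataToLog.CumulativeReward = data.EpisodeInfo.CumulativeReward;
dataToLog.StepsTaken = data.EpisodeInfo.StepsTaken;
end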

For more examples of logging functions, see rlDataLogger.

For this example, configure only the AgentLearnFinishedFcn callback. The function logTrainingLoss logs the actor and critic training losses and is provided at the end of this script.

fileLogger.AgentLearnFinishedFcn = @logTrainingLoss;

Run Training

Create a predefined cart-pole environment with a continuous action space and a deep deterministic policy gradient (DDPG) agent for training.

% Set the random seed
rng(0);

% Create a continuous-action cart-pole environment
env = rlPredefinedEnv("CartPole-Continuous");

% Create a DDPG agent
agent = rlDDPGAgent(getObservationInfo(env), getActionInfo(env));

Specify options to train the agent for 100 episodes, with Episode Manager visualization disabled and command-line display enabled.

trainOpts = rlTrainingOptions(MaxEpisodes=100, Plots="none", Verbose=true);

Train the agent using the train function, specifying the fileLogger object using the Logger name-value argument.

result = train(agent, env, trainOpts, Logger=fileLogger);
Episode:   1/100 | Episode reward:    -4.25 | Episode steps:   47 | Average reward:    -4.25 | Step Count:   47 | Episode Q0:     0.00
Episode:   2/100 | Episode reward:   -20.08 | Episode steps:   31 | Average reward:   -12.17 | Step Count:   78 | Episode Q0:     2.32
Episode:   3/100 | Episode reward:   -40.08 | Episode steps:   11 | Average reward:   -21.47 | Step Count:   89 | Episode Q0:    -5.40
Episode:   4/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -26.62 | Step Count:   98 | Episode Q0:     1.72
Episode:   5/100 | Episode reward:   -40.12 | Episode steps:   11 | Average reward:   -29.32 | Step Count:  109 | Episode Q0:     4.61
Episode:   6/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -36.88 | Step Count:  118 | Episode Q0:     5.31
Episode:   7/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.28 | Step Count:  127 | Episode Q0:     2.58
Episode:   8/100 | Episode reward:   -42.06 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  136 | Episode Q0:    -2.33
Episode:   9/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.48 | Step Count:  146 | Episode Q0:     2.98
Episode:  10/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.87 | Step Count:  155 | Episode Q0:     4.27
Episode:  11/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.87 | Step Count:  164 | Episode Q0:     2.03
Episode:  12/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.87 | Step Count:  173 | Episode Q0:     3.59
Episode:  13/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  183 | Episode Q0:     2.84
Episode:  14/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  193 | Episode Q0:     2.30
Episode:  15/100 | Episode reward:   -43.04 | Episode steps:    8 | Average reward:   -41.88 | Step Count:  201 | Episode Q0:     2.28
Episode:  16/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  211 | Episode Q0:     1.58
Episode:  17/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.49 | Step Count:  221 | Episode Q0:     1.81
Episode:  18/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.49 | Step Count:  231 | Episode Q0:     2.14
Episode:  19/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  240 | Episode Q0:     0.97
Episode:  20/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.29 | Step Count:  250 | Episode Q0:     1.49
Episode:  21/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.48 | Step Count:  259 | Episode Q0:     2.44
Episode:  22/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  268 | Episode Q0:     1.73
Episode:  23/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.87 | Step Count:  277 | Episode Q0:     3.64
Episode:  24/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.87 | Step Count:  286 | Episode Q0:     2.80
Episode:  25/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.87 | Step Count:  296 | Episode Q0:     2.30
Episode:  26/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.87 | Step Count:  305 | Episode Q0:     2.42
Episode:  27/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  315 | Episode Q0:     3.00
Episode:  28/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.48 | Step Count:  325 | Episode Q0:     3.10
Episode:  29/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.29 | Step Count:  335 | Episode Q0:     2.59
Episode:  30/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.29 | Step Count:  345 | Episode Q0:     2.88
Episode:  31/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.09 | Step Count:  355 | Episode Q0:     3.47
Episode:  32/100 | Episode reward:   -40.13 | Episode steps:   11 | Average reward:   -40.90 | Step Count:  366 | Episode Q0:     3.33
Episode:  33/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.10 | Step Count:  375 | Episode Q0:     2.79
Episode:  34/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.29 | Step Count:  384 | Episode Q0:     3.33
Episode:  35/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.49 | Step Count:  393 | Episode Q0:     3.11
Episode:  36/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  402 | Episode Q0:     3.33
Episode:  37/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -42.07 | Step Count:  411 | Episode Q0:     3.59
Episode:  38/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -42.07 | Step Count:  420 | Episode Q0:     3.51
Episode:  39/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -42.07 | Step Count:  429 | Episode Q0:     3.05
Episode:  40/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.87 | Step Count:  439 | Episode Q0:     2.98
Episode:  41/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  449 | Episode Q0:     4.15
Episode:  42/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  458 | Episode Q0:     3.29
Episode:  43/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.49 | Step Count:  468 | Episode Q0:     3.97
Episode:  44/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.29 | Step Count:  478 | Episode Q0:     4.76
Episode:  45/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.48 | Step Count:  487 | Episode Q0:     4.32
Episode:  46/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.48 | Step Count:  497 | Episode Q0:     3.71
Episode:  47/100 | Episode reward:   -42.06 | Episode steps:    9 | Average reward:   -41.48 | Step Count:  506 | Episode Q0:     4.10
Episode:  48/100 | Episode reward:   -42.06 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  515 | Episode Q0:     4.84
Episode:  49/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  525 | Episode Q0:     4.29
Episode:  50/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -41.87 | Step Count:  533 | Episode Q0:     3.69
Episode:  51/100 | Episode reward:   -40.13 | Episode steps:   11 | Average reward:   -41.68 | Step Count:  544 | Episode Q0:     4.69
Episode:  52/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.49 | Step Count:  554 | Episode Q0:     5.02
Episode:  53/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.29 | Step Count:  564 | Episode Q0:     5.54
Episode:  54/100 | Episode reward:   -43.04 | Episode steps:    8 | Average reward:   -41.68 | Step Count:  572 | Episode Q0:     3.47
Episode:  55/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.29 | Step Count:  582 | Episode Q0:     5.94
Episode:  56/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.49 | Step Count:  592 | Episode Q0:     4.76
Episode:  57/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  601 | Episode Q0:     4.95
Episode:  58/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  611 | Episode Q0:     5.17
Episode:  59/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.29 | Step Count:  621 | Episode Q0:     6.07
Episode:  60/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.29 | Step Count:  631 | Episode Q0:     5.55
Episode:  61/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -41.68 | Step Count:  639 | Episode Q0:     5.69
Episode:  62/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.49 | Step Count:  649 | Episode Q0:     5.44
Episode:  63/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.49 | Step Count:  659 | Episode Q0:     4.84
Episode:  64/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  668 | Episode Q0:     5.63
Episode:  65/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  678 | Episode Q0:     5.73
Episode:  66/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.29 | Step Count:  688 | Episode Q0:     5.52
Episode:  67/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -41.68 | Step Count:  696 | Episode Q0:     5.25
Episode:  68/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  706 | Episode Q0:     6.19
Episode:  69/100 | Episode reward:   -40.13 | Episode steps:   11 | Average reward:   -41.29 | Step Count:  717 | Episode Q0:     5.83
Episode:  70/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -41.68 | Step Count:  725 | Episode Q0:     5.27
Episode:  71/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -42.07 | Step Count:  733 | Episode Q0:     5.01
Episode:  72/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  743 | Episode Q0:     6.02
Episode:  73/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  753 | Episode Q0:     5.37
Episode:  74/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -42.07 | Step Count:  762 | Episode Q0:     5.07
Episode:  75/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  772 | Episode Q0:     5.70
Episode:  76/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.48 | Step Count:  781 | Episode Q0:     5.22
Episode:  77/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.48 | Step Count:  791 | Episode Q0:     5.22
Episode:  78/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -41.87 | Step Count:  799 | Episode Q0:     4.63
Episode:  79/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.87 | Step Count:  808 | Episode Q0:     6.37
Episode:  80/100 | Episode reward:   -42.06 | Episode steps:    9 | Average reward:   -42.07 | Step Count:  817 | Episode Q0:     5.16
Episode:  81/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -42.07 | Step Count:  826 | Episode Q0:     5.55
Episode:  82/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -42.46 | Step Count:  834 | Episode Q0:     4.89
Episode:  83/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -42.26 | Step Count:  843 | Episode Q0:     5.00
Episode:  84/100 | Episode reward:   -40.12 | Episode steps:   11 | Average reward:   -41.87 | Step Count:  854 | Episode Q0:     6.36
Episode:  85/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  864 | Episode Q0:     5.94
Episode:  86/100 | Episode reward:   -42.06 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  873 | Episode Q0:     5.06
Episode:  87/100 | Episode reward:   -40.13 | Episode steps:   11 | Average reward:   -41.09 | Step Count:  884 | Episode Q0:     6.56
Episode:  88/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -40.90 | Step Count:  894 | Episode Q0:     5.46
Episode:  89/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.10 | Step Count:  904 | Episode Q0:     6.33
Episode:  90/100 | Episode reward:   -42.06 | Episode steps:    9 | Average reward:   -41.29 | Step Count:  913 | Episode Q0:     4.91
Episode:  91/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.09 | Step Count:  923 | Episode Q0:     5.40
Episode:  92/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.48 | Step Count:  932 | Episode Q0:     5.60
Episode:  93/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.48 | Step Count:  942 | Episode Q0:     6.56
Episode:  94/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -41.87 | Step Count:  950 | Episode Q0:     4.98
Episode:  95/100 | Episode reward:   -41.09 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  960 | Episode Q0:     6.07
Episode:  96/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.68 | Step Count:  970 | Episode Q0:     6.16
Episode:  97/100 | Episode reward:   -41.10 | Episode steps:   10 | Average reward:   -41.49 | Step Count:  980 | Episode Q0:     5.88
Episode:  98/100 | Episode reward:   -42.07 | Episode steps:    9 | Average reward:   -41.68 | Step Count:  989 | Episode Q0:     5.06
Episode:  99/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -41.68 | Step Count:  997 | Episode Q0:     5.24
Episode: 100/100 | Episode reward:   -43.05 | Episode steps:    8 | Average reward:   -42.07 | Step Count: 1005 | Episode Q0:     4.65

The logged data is saved within the directory specified by logDir.

Analyze Logged Data

To view a summary of the generated MAT files, run the following code.

dirInfo = dir(logDir)
dirInfo=102×1 struct array with fields:
    name
    folder
    date
    bytes
    isdir
    datenum

You can import the data into the MATLAB® workspace by loading the MAT files individually, or by writing a script.
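For instance, the following sketch loads a single logged file directly and displays its contents. It assumes that the first generated file is named episode001.mat, per the FileNameRule specified earlier.

% Sketch: load one of the logged MAT files directly and display its contents.
% Assumes episode001.mat exists, per the episode<id> naming rule set earlier.
episodeData = load(fullfile(logDir,"episode001.mat"));
disp(episodeData)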

For this example, import the data using a FileDatastore object. The datastore reads data from each MAT file using the function loadTrainingLoss, which is provided at the end of this script. For more information, see fileDatastore.

% Create a FileDatastore to load data from the MAT files
ds = fileDatastore(logDir, ReadFcn=@loadTrainingLoss, UniformRead=true);
% Read the data
loss = readall(ds);

Plot the training losses.

% Create a figure
f = figure();

% Plot the actor loss
actorLossAx = subplot(2,1,1);
plot(actorLossAx, loss(:,1));
title(actorLossAx,"Actor Loss");
xlabel(actorLossAx,"Learning steps");
ylabel(actorLossAx,"Loss");

% Plot the critic loss
criticLossAx = subplot(2,1,2);
plot(criticLossAx, loss(:,2));
title(criticLossAx,"Critic Loss");
xlabel(criticLossAx,"Learning steps");
ylabel(criticLossAx,"Loss");

The figure contains two axes: the actor loss (top) and the critic loss (bottom), each plotted against the learning steps.

Local Functions

function dataToLog = logTrainingLoss(data)
% Function to log the actor and critic training losses.
dataToLog.ActorLoss = data.ActorLoss;
dataToLog.CriticLoss = data.CriticLoss;
end

function data = loadTrainingLoss(filename)
% Function to load data from MAT files generated by rlDataLogger
% Load the agentLearnData variable from the MAT file
load(filename,"agentLearnData");
% Extract the losses
actorLoss = vertcat(agentLearnData.ActorLoss{:});
criticLoss = vertcat(agentLearnData.CriticLoss{:});
% data is a two-column matrix containing the actor losses in the first
% column and the critic losses in the second column.
data = [actorLoss, criticLoss];
end