Error in creating a custom environment in deep reinforcement learning code

Question

Nour, 15 April 2023

https://jp.mathworks.com/matlabcentral/answers/1947788-error-in-creating-a-custom-environment-in-deep-reinforcement-learning-code

Answered: Emmanouil Tzorakoleftherakis, 24 April 2023

I am creating a movie recommender system using deep reinforcement learning, and I hit an error while creating the environment for the RL problem. I couldn't find a generic example of how to create a custom RL environment in MATLAB; all the environments in the examples I found are predefined. Please assist, as I keep getting the following error:

Error using rl.env.MATLABEnvironment/validateEnvironment
Environment 'ObservationInfo' does not match observation output from step function. Check the data type, dimensions, and range.

Error in rl.env.rlFunctionEnv (line 74)
validateEnvironment(this);

Error in rlFunctionEnv (line 45)
env = rl.env.rlFunctionEnv(varargin{:});

Error in untitled9 (line 22)
env = rlFunctionEnv(observationInfo, actionInfo, @(Action,LoggedSignals) myStepFunction(Action,LoggedSignals,ratings), @myResetFunction);
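
For context, the trace shows that rlFunctionEnv calls validateEnvironment when the environment is constructed, and the message says the observation returned by the step function does not match ObservationInfo. The snippet below is a minimal sketch of the function signatures rlFunctionEnv expects, assuming a toy counter task. The task and the names toyStep and toyReset are illustrative and not part of my project.

```matlab
% Minimal custom environment: a 3-element history of the last actions.
obsInfo = rlNumericSpec([3 1]);      % observation is a 3-by-1 column vector
obsInfo.Name = 'observation';
actInfo = rlFiniteSetSpec([-1 1]);   % two discrete actions
actInfo.Name = 'action';

% The constructor validates the step and reset outputs against obsInfo.
env = rlFunctionEnv(obsInfo, actInfo, @toyStep, @toyReset);

function [obs, info] = toyReset()
    % Reset returns the initial observation and the data carried between steps.
    info.State = zeros(3, 1);
    obs = info.State;
end

function [nextObs, reward, isDone, info] = toyStep(action, info)
    % Step receives the action first and the carried data second,
    % and returns four outputs in this order.
    info.State = [info.State(2:end); action];   % stays 3-by-1
    nextObs = info.State;
    reward = double(action == 1);
    isDone = false;
end
```

The point of the sketch is that the observation returned by both functions has exactly the size declared in obsInfo, and that the step function returns the observation, reward, done flag, and carried data together.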

I have written the following code so far.

clear all
% Load the MovieLens dataset
ratings = readtable('ml-latest-small/ratings.csv', 'VariableNamingRule', 'preserve');
opts = detectImportOptions('ml-latest-small/movies.csv');
movies = readtable('ml-latest-small/movies.csv', opts);
% Preprocess the data to create the state space and reward function
numMovies = height(movies); % number of movies
numGenres = 20; % number of movie genres
numRatings = 5; % number of possible movie ratings
numUsers = max(ratings.userId); % number of users
stateSize = numMovies + numGenres + numRatings + numUsers;
observationInfo = rlNumericSpec([stateSize 1]);
observationInfo.Name = 'observation';
% Define the action space
actionInfo = rlFiniteSetSpec([1:numMovies]);
actionInfo.Name = 'action';
% Define the environment
env = rlFunctionEnv(observationInfo, actionInfo, @(state,action) myStepFunction(state,action,ratings), @myResetFunction);
% Define the DQN agent
numHiddenUnits = 64;
statePath = [    imageInputLayer([stateSize 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(numHiddenUnits,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(numHiddenUnits,'Name','fc2')
    reluLayer('Name','relu2')
    fullyConnectedLayer(numMovies,'Name','fc3')];
dqn = rlDQNAgent(statePath,actionInfo,'UseDoubleDQN',true);
% Train the agent
maxEpisodes = 100;
maxSteps = 10;
trainOpts = rlTrainingOptions('MaxEpisodes',maxEpisodes,'MaxStepsPerEpisode',maxSteps);
trainingStats = train(dqn,env,trainOpts);
function next_state = myStepFunction(state, action, ratings)
% This function takes the current state, an action, and a matrix of ratings
% as input and returns the next state.
% Calculate the new state based on the action
new_state = [state(2:end); action];
% Return the new state
next_state = new_state;
end
function [initial_state, LoggedSignal] = myResetFunction()
% This function returns the initial state for the movie recommendation system.
% Load the movie ratings dataset
%ratings = readmatrix('ml-latest-small/ratings.csv');
num_movies = 9742;
% Initialize the state with no movie ratings
initial_state = zeros(num_movies + 20 + 5 + 610, 1);
% Set the first element of the state to a random integer between 1 and the number of movies
LoggedSignal.State = randi(num_movies);
% Assign the random movie selection to the first element of the state vector
initial_state(1) = LoggedSignal.State;
end
function reward = myRewardFunction(state, action, ratings)
% This function takes the current state, an action, and a matrix of ratings
% as input and returns the reward.
% Calculate the reward based on the ratings
reward = ratings(state(1), action);
end
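
One possible rework of the code above is sketched here; it is not a definitive fix. Several things are assumptions made only so the example is self-contained: the three-row ratings table is made-up placeholder data with the same column names as ratings.csv, the user is fixed to userId 1 for the whole episode, the action index is treated as the movieId, and the names movieStep and movieReset are hypothetical. The DQN part assumes a release that provides rlVectorQValueFunction (R2022a or later).

```matlab
% Placeholder data with the same column names as ratings.csv (values are made up).
ratings = table([1; 1; 2], [1; 3; 2], [4.0; 2.5; 5.0], ...
    'VariableNames', {'userId', 'movieId', 'rating'});
numMovies = 3;                        % stands in for height(movies)
stateSize = numMovies + 20 + 5 + 2;   % same layout as above, with 2 users

observationInfo = rlNumericSpec([stateSize 1]);
observationInfo.Name = 'observation';
actionInfo = rlFiniteSetSpec(1:numMovies);
actionInfo.Name = 'action';

% Extra arguments are bound in anonymous handles; the first two step
% arguments stay (Action, LoggedSignals).
stepFcn  = @(Action, LoggedSignals) movieStep(Action, LoggedSignals, ratings);
resetFcn = @() movieReset(stateSize, numMovies);
env = rlFunctionEnv(observationInfo, actionInfo, stepFcn, resetFcn);

% Vector observations use featureInputLayer; the critic outputs one
% Q-value per discrete action.
layers = [featureInputLayer(stateSize, 'Name', 'observation')
          fullyConnectedLayer(64, 'Name', 'fc1')
          reluLayer('Name', 'relu1')
          fullyConnectedLayer(numMovies, 'Name', 'q')];
critic = rlVectorQValueFunction(dlnetwork(layers), observationInfo, actionInfo);
agent  = rlDQNAgent(critic, rlDQNAgentOptions('UseDoubleDQN', true));

function [obs, LoggedSignals] = movieReset(stateSize, numMovies)
    % All-zero state whose first entry is a random movie index.
    LoggedSignals.State = zeros(stateSize, 1);
    LoggedSignals.State(1) = randi(numMovies);
    LoggedSignals.UserId = 1;   % assumption: one fixed user per episode
    obs = LoggedSignals.State;
end

function [nextObs, reward, isDone, LoggedSignals] = movieStep(Action, LoggedSignals, ratings)
    % Shift the state and append the chosen movie; size stays stateSize-by-1.
    LoggedSignals.State = [LoggedSignals.State(2:end); Action];
    nextObs = LoggedSignals.State;
    % Reward: the user's rating for the chosen movie, or 0 if unrated.
    hit = ratings.userId == LoggedSignals.UserId & ratings.movieId == Action;
    if any(hit)
        reward = ratings.rating(find(hit, 1));
    else
        reward = 0;
    end
    isDone = false;
end
```

Compared with the original, the step function takes (Action, LoggedSignals) and returns four outputs, the state is carried in LoggedSignals instead of being passed as the first argument, and the reward is looked up by column in the ratings table rather than by ratings(state(1), action). With the real dataset, movieId values are not guaranteed to equal the action index, so a mapping from action to movieId would be needed.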