Why do the partial dependence plots I code myself not match the plots from the MATLAB "plotPartialDependence" function?

I am fitting a random forest model to a simple test case, and when I code the partial dependence plot myself I cannot reproduce the plot that the "plotPartialDependence" function produces.
I start with a very simple system, Y = X1 - X2 plus some noise, and fit the random forest model. I then replace every row of the second variable (X2) with the mean of X2 and get the predictions. I expected the two curves to coincide, but they are always offset by a small amount (although they always have the same shape).
I have tried this for several different sample equations and it always comes out the same. Any ideas what might be causing the offset?
t = (0:0.01:25)'; % Range of numbers (renamed so it does not shadow the range function)
constant1 = 2; % First constant
constant2 = 3.5; % Second constant
X = [t./(constant1+t), t./(constant2+t)]; % First and second equations as predictor columns
rng(1,'twister') % Seed once for reproducibility
Y = (X(:,1) - X(:,2)) + 0.1*rand(size(t)); % Response variable with uniform noise
% Run the random forest model
Mdl_rf = TreeBagger(500,X,Y,'OOBPredictorImportance','on','PredictorSelection','interaction-curvature','Method','regression');
Xmean = X; % Copy so the training data stays intact
Xmean(:,2) = mean(X(:,2)); % Substitute the mean value of Column 2 for all rows in Column 2
predictions_rf = predict(Mdl_rf,Xmean); % Get the predictions with X2 held at its mean
figure
plotPartialDependence(Mdl_rf,1) % Plot using the partial dependence function
hold on
scatter(Xmean(:,1),predictions_rf) % Plot the values for the first variable against the new predictions
hold off
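
A possible explanation, offered as a sketch rather than a confirmed diagnosis: partial dependence is usually defined as the average of the model's predictions over all observed values of the other predictors, whereas the manual curve above is the prediction at the average of X2. For a nonlinear model such as a random forest these two quantities generally differ. The example below illustrates the difference with a stand-in prediction function; predictFcn, Xtr, grid, pdAverage and pdAtMean are names assumed for illustration, and the 100-point grid is an arbitrary choice.

```matlab
% Stand-in for a fitted model's prediction function (hypothetical, chosen to
% be nonlinear in X2 so the two curves differ by a known amount).
predictFcn = @(X) X(:,1) - X(:,2).^2;

x   = (0:0.01:25)';
Xtr = [x./(2 + x), x./(3.5 + x)];   % same predictors as in the question

% Query points for X1 (the number of points is an assumption)
grid = linspace(min(Xtr(:,1)), max(Xtr(:,1)), 100)';

pdAverage = zeros(numel(grid), 1);  % average of predictions over observed X2
pdAtMean  = zeros(numel(grid), 1);  % prediction at mean(X2)
for k = 1:numel(grid)
    Xk = Xtr;
    Xk(:,1) = grid(k);                    % fix X1, keep every observed X2
    pdAverage(k) = mean(predictFcn(Xk));  % partial dependence estimate
    pdAtMean(k)  = predictFcn([grid(k), mean(Xtr(:,2))]);
end

% For this stand-in the gap is constant and equals minus the population
% variance of X2, so the curves have the same shape but are shifted.
offset = pdAverage - pdAtMean;
```

To try this on the model in the question, one could set predictFcn = @(X) predict(Mdl_rf, X) and pass the original training predictors (before column 2 is overwritten) as Xtr. With a random forest the gap need not be exactly constant, and because X1 and X2 are strongly related here, many of the evaluated points lie far from the training data.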

Answers (0)
