how to plot accuracy?

arian hoseini, 1 July 2024
Commented: Taylor, 3 July 2024
Error using trainNetwork (line 150)
Invalid training data. Sequence responses must have the same sequence length as the corresponding predictors.
Error in Untitled (line 92)
net = trainNetwork(x_train_seq, y_train_seq, layers, options);
% Load the files
file_path_e = 'sig.xlsx';
file_path_sig = 'E.xlsx';
% Read the files
data_e = readtable(file_path_e);
data_sig = readtable(file_path_sig);
% Prepare data
x = table2array(data_e);
y = table2array(data_sig);
% Ensure x and y have the same length
min_length = min(length(x), length(y));
x = x(1:min_length);
y = y(1:min_length);
% Display initial data types
disp('Initial data types:');
disp(['x type: ', class(x)]);
disp(['y type: ', class(y)]);
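% Convert to numeric arrays only if not already numeric
% (str2double returns NaN for every element of a numeric input)
if ~isnumeric(x), x = str2double(x); end
if ~isnumeric(y), y = str2double(y); end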
% Display number of NaNs before removing them
fprintf('Number of NaNs in x before removal: %d\n', sum(isnan(x)));
fprintf('Number of NaNs in y before removal: %d\n', sum(isnan(y)));
% Handle non-numeric entries by removing NaNs
valid_indices = ~isnan(x) & ~isnan(y);
x = x(valid_indices);
y = y(valid_indices);
% Display number of valid data points after removal
fprintf('Number of valid data points after preprocessing: %d\n', length(x));
% Ensure x and y have the same length again after removing NaNs
min_length = min(length(x), length(y));
x = x(1:min_length);
y = y(1:min_length);
% Check if there are enough valid entries
if min_length <= 1
    error('Not enough valid data points after preprocessing.');
end
% Force column vectors (the LSTM sequences are built after the train/test split)
x = reshape(x, [], 1);
y = reshape(y, [], 1);
% Scale data using min-max normalization
x_scaled = (x - min(x)) / (max(x) - min(x));
y_scaled = (y - min(y)) / (max(y) - min(y));
% Split data into train and test sets
cv = cvpartition(length(x_scaled), 'HoldOut', 0.2);
x_train = x_scaled(training(cv));
y_train = y_scaled(training(cv));
x_test = x_scaled(test(cv));
y_test = y_scaled(test(cv));
% Create sequences for LSTM
seq_length = 10;
[x_train_seq, y_train_seq] = create_sequences(x_train, y_train, seq_length);
[x_test_seq, y_test_seq] = create_sequences(x_test, y_test, seq_length);
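% Convert to N-by-1 cell arrays: trainNetwork expects one cell per sequence,
% each of size numFeatures-by-numTimeSteps (here 1-by-seq_length)
x_train_seq = num2cell(x_train_seq, 2);
x_test_seq = num2cell(x_test_seq, 2);
% Build the LSTM model
% The last LSTM layer uses 'OutputMode','last' so the network emits one value
% per sequence, matching the single target stored for each window
layers = [
    sequenceInputLayer(1)
    lstmLayer(20, 'OutputMode', 'sequence')
    dropoutLayer(0.2)
    lstmLayer(20, 'OutputMode', 'last')
    dropoutLayer(0.2)
    fullyConnectedLayer(1)
    regressionLayer];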
options = trainingOptions('adam', ...
    'MaxEpochs', 300, ...
    'MiniBatchSize', 20, ...
    'InitialLearnRate', 0.001, ...
    'ValidationData', {x_test_seq, y_test_seq}, ...
    'Plots', 'training-progress', ...
    'Verbose', 0);
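% Train the model; the second output holds the per-iteration training metrics
[net, training_info] = trainNetwork(x_train_seq, y_train_seq, layers, options);
% Plot loss curve (metrics are recorded per iteration, not per epoch)
figure;
plot(training_info.TrainingLoss, 'DisplayName', 'Train');
hold on;
% ValidationLoss is NaN on iterations where no validation was run
val_iters = find(~isnan(training_info.ValidationLoss));
plot(val_iters, training_info.ValidationLoss(val_iters), '-o', 'DisplayName', 'Validation');
title('Model loss');
xlabel('Iteration');
ylabel('Loss');
legend('show');
hold off;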
% Function to create sequences
% xs: one window of seq_length predictor values per row
% ys: one target per window (column vector)
function [xs, ys] = create_sequences(x_data, y_data, seq_length)
    xs = [];
    ys = [];
    for i = 1:(length(x_data) - seq_length)
        x_seq = x_data(i:i+seq_length-1);
        y_seq = y_data(i+seq_length-1); % Target is the y value at the last time step of the window
        xs = [xs; x_seq'];
        ys = [ys; y_seq'];
    end
end

Answers (1)

Taylor, 2 July 2024
You're using your test data as validation data. Ideally, you should have three sets: training data that is used only to train the model, validation data that is used to tune hyperparameters and detect overfitting, and test data that is used only to evaluate the final model (never used in any part of training). You can use the predict function on your test data and, for a classification network, scores2label to convert prediction scores to labels. You can then calculate accuracy by comparing the predicted labels to the known labels for the test data.
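To make the last step concrete, here is a minimal sketch of the comparison itself. All values below are hypothetical stand-ins for real test targets and predictions, not data from the question. Accuracy only makes sense once predictions are labels; for a network that ends in a regressionLayer, as the one in the question does, an error measure such as RMSE is the usual analogue, so both are shown.

```matlab
% Hypothetical toy values standing in for test targets and network predictions
y_true = [0.10; 0.40; 0.35; 0.80];
y_pred = [0.12; 0.38; 0.30; 0.90];

% Regression: root-mean-square error between predictions and targets
rmse = sqrt(mean((y_pred - y_true).^2));

% Classification: fraction of predicted labels that match the known labels
labels_true = categorical({'a'; 'b'; 'b'; 'a'});
labels_pred = categorical({'a'; 'b'; 'a'; 'a'});
accuracy = mean(labels_pred == labels_true);

fprintf('RMSE = %.4f, accuracy = %.2f\n', rmse, accuracy);
```

In a real workflow, y_pred or labels_pred would come from calling predict on the held-out test sequences.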
2 Comments
arian hoseini, 3 July 2024
I don't understand. What exactly should I do? Could you show it in my code?
Taylor, 3 July 2024
In this section where you split your data:
% Split data into train and test sets
cv = cvpartition(length(x_scaled), 'HoldOut', 0.2);
x_train = x_scaled(training(cv));
y_train = y_scaled(training(cv));
x_test = x_scaled(test(cv));
y_test = y_scaled(test(cv));
you should also make a validation split. One way to accomplish this would be with the dividerand function:
trainRatio = 0.6;
valRatio = 0.2;
testRatio = 0.2;
[trainInd, valInd, testInd] = dividerand(length(x_scaled), trainRatio, valRatio, testRatio);
x_train = x_scaled(trainInd);
y_train = y_scaled(trainInd);
x_val = x_scaled(valInd);
y_val = y_scaled(valInd);
x_test = x_scaled(testInd);
y_test = y_scaled(testInd);
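A minimal sketch of how the validation split could then feed into training, assuming the same sliding-window approach as in the question. The random vectors are stand-ins for the split data, and make_x and make_y are hypothetical helpers introduced here for illustration (they are not part of any toolbox). Each set of sequences is built as an N-by-1 cell array of 1-by-seq_length rows, which is the layout trainNetwork accepts for sequence input.

```matlab
rng(0);                                   % reproducible toy data
% Assumption: stand-ins for x_train/y_train and x_val/y_val after the split
x_train = rand(60, 1);  y_train = rand(60, 1);
x_val   = rand(20, 1);  y_val   = rand(20, 1);
seq_length = 10;

% Hypothetical helpers: window i covers samples i..i+seq_length-1,
% and its target is the y value at the last time step of that window
make_x = @(x) num2cell(x((1:seq_length) + (0:numel(x)-seq_length)'), 2);
make_y = @(y) y(seq_length:end);

x_train_seq = make_x(x_train);  y_train_seq = make_y(y_train);
x_val_seq   = make_x(x_val);    y_val_seq   = make_y(y_val);

% Validate on the validation split; the test split stays untouched until the end
options = trainingOptions('adam', ...
    'MaxEpochs', 300, ...
    'MiniBatchSize', 20, ...
    'InitialLearnRate', 0.001, ...
    'ValidationData', {x_val_seq, y_val_seq}, ...
    'Plots', 'training-progress', ...
    'Verbose', 0);
```

After training with these options, the test sequences (built the same way from x_test and y_test) can be passed to predict for the final evaluation.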

Release: R2016b
