Error using mkdir with "lock folders" in a parfor loop

Mitchell Tillman on 7 Aug 2024 at 1:57
Commented: Walter Roberson on 20 Aug 2024 at 18:16
I am writing parfor loops and using folders as "lock files" for thread safety, because mkdir is atomic and I therefore believe it is the best option in MATLAB for a "lock file". However, I am getting the following error:
Error using makeLockFile
The system cannot find the path specified.
This error occurs on the line in makeLockFile containing
mkdir(pidPath);
Here is my code for makeLockFile (top), releaseLockFile (middle), and the helper function getMATLABPids (bottom):
What I am trying to do is:
  1. Create a "lock folder". This folder has the same path as a .mat file (except the extension of course). If it exists, that means a MATLAB instance is using the corresponding .mat file.
  2. If successful, create a subfolder named as the instance's process ID, and return. This is where the issue is!
  3. If unsuccessful, that means another process has locked this variable. In that case, remove any previous (now inactive) process's lock subfolders that were left behind due to interrupted processes (this ensures that a process that is no longer running is not the sole reason for a lock). Then try deleting the lock folder. If that fails, pause and repeat.
When I encounter the above error, I can restart the process and then it runs all the way through, so I know it's not a path correctness issue.
I don't know how to debug this any further. I've been using disp() but can't make heads or tails of why mkdir would ever fail in this context.
function [pidPath, status] = makeLockFile(lockFolderPath)
%% PURPOSE: CREATE A LOCK FOLDER, BECAUSE MKDIR IS ATOMIC.
% Clean the lockFolderPath and add the process ID.
[folderPath, folderName] = fileparts(lockFolderPath);
pid = num2str(feature('getpid')); % Get MATLAB instance's process ID.
lockFolderPath = [folderPath filesep folderName];
pidPath = [lockFolderPath filesep pid];
status=false;
while ~status
    disp(['PID: ' pid ' Lock Folder: ' lockFolderPath]);
    [status, message, messageID] = mkdir(lockFolderPath);
    try
        if ~status
            % The variable is locked.
            % Get the listing of all folders in the variable folder
            items = dir(lockFolderPath);
            folderItems = items([items.isdir]);
            folderNames = {folderItems.name};
            folderNames(ismember(folderNames, {'.', '..'})) = []; % Remove '.' and '..'
            % Get all of the active MATLAB PID's
            pids = getMATLABPids();
            % Remove the inactive (leftover from previous processes) PID's
            inactivePIDs = folderNames(~ismember(folderNames, pids));
            for i = 1:length(inactivePIDs)
                rmdir([lockFolderPath filesep inactivePIDs{i}]);
            end
            % Try releasing the lock folder. Has no effect if folder is not
            % empty (i.e. active process folders present)
            releaseLockFile(pidPath);
            continue;
        else
            % Make the folder for this specific process ID
            mkdir(pidPath);
        end
    catch ME
        disp(['ERROR PID: ' pid ' Lock Folder Path: ' pidPath]);
        disp({ME.stack.file});
        disp({ME.stack.name});
        disp({ME.stack.line});
        rethrow(ME);
    end
end
end

function [] = releaseLockFile(pidPath)
%% PURPOSE: RELEASE THE LOCK FILE.
status = rmdir(pidPath); % PID-specific folder.
pid = feature('getpid');
lockFolderPath = fileparts(pidPath); % Parent folder (the lock folder)
status = rmdir(lockFolderPath); % Remove the folder only if it is empty.
end

function [pids] = getMATLABPids()
%% PURPOSE: GET ALL OF THE PROCESS ID'S FOR ALL MATLAB INSTANCES
if ispc==1
    % Execute the command to get a list of all running MATLAB processes
    [status, result] = system('tasklist /FI "IMAGENAME eq matlab.exe" /FO LIST');
    % Check if the command was successful
    if status == 0
        % Process the result to extract PIDs
        lines = strsplit(result, '\n');
        pids = [];
        for i = 1:length(lines)
            line = strtrim(lines{i});
            if startsWith(line, 'PID:')
                pid = extractAfter(line, 'PID:');
                pids = [pids; {pid}];
            end
        end
    else
        error('Failed to execute system command. Status: %d', status);
    end
else
    % Execute the command to get PIDs of running MATLAB processes
    [status, result] = system('pgrep matlab');
    % Check if the command was successful
    if status == 0
        % Process the result to extract PIDs
        pids = strsplit(strtrim(result));
    else
        error('Failed to execute system command. Status: %d', status);
    end
end
end
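
Two details of the code above may be worth checking; nobody in the thread confirms either as the cause, so treat this as a diagnostic sketch only. First, the mkdir documentation states that status is 1 both when the folder is created and when it already exists, so `~status` may never signal "locked"; the third output (the message ID) is what differs between the two cases. Second, on Windows, extractAfter keeps the blanks that follow 'PID:', so the extracted PIDs may never match the folder names. The folder path and the sample tasklist line below are made up for illustration.

```matlab
% Diagnostic sketch (hypothetical path and sample text, not the asker's data).
lockFolderPath = fullfile(tempdir, 'demoLockFolder');
if isfolder(lockFolderPath)
    rmdir(lockFolderPath, 's');              % start from a clean state
end

[status1, ~, id1] = mkdir(lockFolderPath);   % folder did not exist yet
[status2, ~, id2] = mkdir(lockFolderPath);   % folder already exists

% status is 1 in both calls; only the message ID tells them apart.
createdFirst  = status1 && isempty(id1);
createdSecond = status2 && isempty(id2);
fprintf('first call:  status=%d, created=%d\n', status1, createdFirst);
fprintf('second call: status=%d, created=%d, id=%s\n', status2, createdSecond, id2);

% PID parsing: extractAfter keeps the leading blanks, strtrim removes them.
line = 'PID:          1234';                 % made-up line in tasklist LIST format
rawPid = extractAfter(line, 'PID:');
cleanPid = strtrim(rawPid);
fprintf('raw PID matches folder name "1234":     %d\n', strcmp(rawPid, '1234'));
fprintf('trimmed PID matches folder name "1234": %d\n', strcmp(cleanPid, '1234'));

rmdir(lockFolderPath);                       % clean up
```

Whether the message-ID check is race-free across several processes is not something the documentation promises, so it should not be taken as a proven lock.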
  1 Comment
Walter Roberson on 7 Aug 2024 at 5:16
The official way to handle all of this is to use spmd instead of parfor, together with labBarrier or spmdBarrier.
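
A minimal sketch of what the barrier approach could look like, assuming Parallel Computing Toolbox and R2022b or later (where spmdBarrier, spmdIndex and spmdSize replace the older labBarrier, labindex and numlabs names). The file name and the helper appendVariable are hypothetical and not from the thread. Workers take turns: only the worker whose index matches the current turn touches the file, and the barrier keeps the others waiting until that turn is over.

```matlab
% Sketch only: one worker at a time writes, separated by barriers.
filename = fullfile(tempdir, 'sharedData.mat');   % hypothetical file name
created = true;
save(filename, 'created');                        % start with a small MAT file

spmd
    for turn = 1:spmdSize
        if spmdIndex == turn
            % Only the worker whose turn it is touches the file.
            appendVariable(filename, sprintf('worker_%d', spmdIndex), rand(5, 1));
        end
        spmdBarrier;   % all workers wait here until the current turn is finished
    end
end

disp(who('-file', filename));

function appendVariable(filename, varName, value)
% Hypothetical helper. Saving inside a function keeps the spmd body transparent.
s = struct(varName, value);
save(filename, '-struct', 's', '-append');
end
```

This serializes the file access completely, so it only makes sense when the protected section is short compared with the rest of the work.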


Answers (1)

Madheswaran on 20 Aug 2024 at 6:36
Hi Mitchell,
It seems like you're trying to implement a custom mutex to handle the critical-section problem that arises when multiple parallel processes access the same MAT file. As far as I understand, this issue typically arises because the assignment operation might not be atomic.
If your goal is to write data to a MAT file concurrently, I recommend using "spmd" with "DataQueue". The core idea is to transmit data from each worker to the client using the "send" method of "DataQueue". Each time new data arrives in the "DataQueue", the client runs the function registered with "afterEach".
Here's a sample code snippet explaining this approach:
numWorkers = 6;
queue = parallel.pool.DataQueue;

% Create an empty MAT file
emptystruct = struct;
save('workerData.mat', '-struct', "emptystruct");

% Run saveDataToFile on the client each time a worker sends data
afterEach(queue, @saveDataToFile);

spmd(numWorkers)
    data = rand(5, 1);
    dataWithIndex = {data, spmdIndex};
    send(queue, dataWithIndex);
end
delete(gcp('nocreate'));

% Verify the contents of the MAT file
matfileObj = matfile("workerData.mat");
variablesList = who(matfileObj);
disp(variablesList);

% Local function, placed at the end of the script (required before R2024a)
function saveDataToFile(dataWithIndex)
    data = dataWithIndex{1};
    workerId = dataWithIndex{2};
    varName = sprintf('worker_%d', workerId);
    filename = 'workerData.mat';
    % Load, add the new variable, and save everything back
    existingData = load(filename);
    existingData.(varName) = data;
    save(filename, '-struct', 'existingData');
end
Please refer to the following MathWorks documentation pages:
  1. Run single programs on multiple data sets - https://mathworks.com/help/parallel-computing/execute-simultaneously-on-multiple-data-sets.html
  2. afterEach function - https://mathworks.com/help/matlab/ref/parallel.pool.dataqueue.aftereach.html
  3. save function - https://mathworks.com/help/matlab/ref/save.
  1 Comment
Walter Roberson on 20 Aug 2024 at 18:16
Is there a particular reason to use spmd here instead of parfor? Data queues can be used with parfor as well.
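
A minimal sketch of the parfor variant, assuming Parallel Computing Toolbox; the file name, the variable names and the helper appendVariable are hypothetical. The workers never touch the file: they only send results, and the client writes them one message at a time in the afterEach callback.

```matlab
% Sketch only: parfor workers send results; the client alone writes the file.
filename = fullfile(tempdir, 'parforData.mat');   % hypothetical file name
created = true;
save(filename, 'created');                        % start with a small MAT file

queue = parallel.pool.DataQueue;
afterEach(queue, @(msg) appendVariable(filename, msg));

numIterations = 6;
parfor k = 1:numIterations
    result = rand(5, 1);
    send(queue, {k, result});    % no file access on the workers
end

drawnow;   % may give pending afterEach callbacks a chance to run on the client
disp(who('-file', filename));

function appendVariable(filename, msg)
% Hypothetical helper. Runs on the client, so writes do not overlap.
s = struct(sprintf('iter_%d', msg{1}), msg{2});
save(filename, '-struct', 's', '-append');
end
```

Because the callbacks run on the client, some of them may complete slightly after the loop itself returns.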

Release

R2023b
