Comparing elemental composition with databank

Tatjana Mü on 20 Oct 2023
Commented: Jon on 24 Oct 2023
I have a 22x1 string array (Probe_RL) that contains elemental compositions, and a 13x2 database (Datenbank). I want to compare the compositions in the two. If a sample matches a database entry, the group should be added to the corresponding row of Probe_RL. The problem is that the factors should have a tolerance of 0.2, so that, for example, 1Fe 0.9Si still matches Group 206 in the database. I tried the following and hope somebody can help me:
% Sample data
Probe_RL = {'1Fe', '1Fe', '1Fe,1Si', '1Fe', '0.35Si,1Fe', '0.19Fe,0.57K,1Si', '0.47Fe,0.68K,1Si', '1Si', '0.81Fe,1Si', '1Fe', '1Si', '1Fe', '0.44Fe,1Si', '1Na', '1Fe', '1Si', '0.43Fe,1Si', '0.6Si,1Fe', '1Fe', '0.15Fe,1Si', '1Fe', '1Fe'};
Datenbank = {'1Na,1U Group 204', '1Bi,1V Group 205', '1Fe,1Si Group 206', '0.11Si,1Mg Group 207', '0.83Mg,1Si Group 208', '0.1Ca,0.2Al,1Si Group 209', '0.83Ca,1Si Group 210', '0.67Ca,1Al,1Si Group 211', '0.5Co,1Na Group 212', '0.5Pb,1Co Group 213', '0.5Na,1Si Group 214', '1Al,1Ca Group 215', '0.5Co,1Ni Group 216'};
Datenbank = fliplr(Datenbank);
Probe_RL = fliplr(Probe_RL);
% Split the sample composition into two columns
[Probe_RL_2D, ~] = cellfun(@strsplit, Probe_RL, repmat({","}, length(Probe_RL), 1), 'UniformOutput', false);
% Add sample numbers
for i = 1:length(Probe_RL)
Probe_RL_2D{i,2} = num2str(i);
end
% Compare the samples with the database
for i = 1:length(Probe_RL)
if ~isempty(Datenbank)
% Compare the samples with the database
if numel(Probe_RL_2D{i,1:2}) > 1
Probe_RL_2D{i,1:2} = Probe_RL_2D{i,1:2}(1);
end
% Convert the character vector into a cell vector
Probe_RL_2D{i,1:2} = cellstr(char(''));
Probe_RL_2D{i,1:2} = strjoin(Probe_RL_2D{i,1:2}, "");
% Arrange Datenbank and Probe_RL vertically
Datenbank = fliplr(Datenbank);
Probe_RL_2D = fliplr(Probe_RL_2D);
j = find(abs(str2double(Probe_RL_2D{i,1:2}) - str2double(Datenbank(:,1:2))) < 0.2);
if ~isempty(j)
% Check the matching minerals
if length(j) > 1
Probe_RL(i,2) = "Unbekannt";
else
Probe_RL(i,2) = Datenbank(j(1),2);
end
else
Probe_RL(i,2) = "Unbekannt";
end
else
Probe_RL(i,2) = "Unbekannt";
end
end
% Display the results
disp(Probe_RL)
3 Comments
Tatjana Mü on 20 Oct 2023
I am sorry. For example, in lines 3 and 4, Group 206 should be entered in the second column. So yes, that is correct: the samples should be labeled, and if no mineral from the database is compatible, the entry should read "not identified" (sorry, in my code it is written in German: "unbekannt"). As you can see, each element is preceded by a factor, and these factors should have a tolerance of 0.1.
Jon on 20 Oct 2023
Wouldn't the 4th entry in Probe_RL be "unbekannt", as there is no pure Fe in the database and the 4th entry in Probe_RL has Fe as its only component?


Accepted Answer

Jon on 20 Oct 2023
Edited: Jon on 20 Oct 2023
Here is one approach, in which I put the data into tables and then use ismembertol to do the comparison with a tolerance. I represent the group for each sample as a numeric value (rather than a string), and if no match is found I set the group number to NaN rather than "unbekannt". You could modify this if it is important. To me it seemed better to keep the group number, which is fundamentally a number and not a string, as a numerical value.
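As a minimal sketch of the matching rule this approach relies on: with 'ByRows' set to true and 'DataScale' set to 1, ismembertol treats the tolerance as absolute, so a reference row matches a sample row when every column differs by at most tol. The two-column [Fe Si] layout and the values below are illustrative assumptions, not the full databank.

```matlab
% Reference compositions, one row per group; columns are [Fe Si]
ref = [1 1;          % 1Fe,1Si
       0 1];         % 1Si only
sample = [0.81 1];   % 0.81Fe,1Si
tol = 0.2;

% Absolute tolerance: 'DataScale',1 turns off the default scaling of tol
% by the largest absolute value in the inputs
isMatch = ismembertol(ref, sample, tol, 'DataScale', 1, 'ByRows', true);

% The same test written out explicitly (implicit expansion, R2016b or later)
isMatchManual = all(abs(ref - sample) <= tol, 2);

disp([isMatch, isMatchManual])
```

Only the first reference row matches here, because Fe differs by 0.19 and Si by 0; the second row fails on Fe alone.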
% Sample data
Probe_RL = {'1Fe', '1Fe', '1Fe,1Si', '1Fe', '0.35Si,1Fe', '0.19Fe,0.57K,1Si', '0.47Fe,0.68K,1Si', '1Si', '0.81Fe,1Si', '1Fe', '1Si', '1Fe', '0.44Fe,1Si', '1Na', '1Fe', '1Si', '0.43Fe,1Si', '0.6Si,1Fe', '1Fe', '0.15Fe,1Si', '1Fe', '1Fe'};
Datenbank = {'1Na,1U Group 204', '1Bi,1V Group 205', '1Fe,1Si Group 206', '0.11Si,1Mg Group 207', '0.83Mg,1Si Group 208', '0.1Ca,0.2Al,1Si Group 209', '0.83Ca,1Si Group 210', '0.67Ca,1Al,1Si Group 211', '0.5Co,1Na Group 212', '0.5Pb,1Co Group 213', '0.5Na,1Si Group 214', '1Al,1Ca Group 215', '0.5Co,1Ni Group 216'};
% Parameters
tol = 0.2; % tolerance for matching elements;
% Parse data and put into cell arrays
ProbeDat = cell(numel(Probe_RL),3); % element, quantity, original string
for k = 1:numel(Probe_RL)
% Break up individual entries using comma delimiters
components = strsplit(Probe_RL{k},',');
% Loop through components, getting element and quantity
numComponents = numel(components);
quantity = zeros(numComponents,1); % preallocate
element = cell(numComponents,1); % preallocate
for m = 1:numel(components)
[quantity(m),element{m}] = numstr(components{m});
end
% Put into cell array
ProbeDat{k,1} = element;
ProbeDat{k,2} = quantity;
ProbeDat{k,3} = Probe_RL{k};
end
RefDat = cell(numel(Datenbank),3);
for k = 1:numel(Datenbank)
% Break up individual entries using comma and whitespace delimiters
parts = strsplit(Datenbank{k},{',',' '});
% Loop through components, getting element and quantity
numParts = numel(parts);
quantity = zeros(numParts-2,1); % preallocate
element = cell(numParts-2,1); % preallocate
for m = 1:numParts -2
[quantity(m),element{m}] = numstr(parts{m});
end
group = str2double(parts{end});
% Put into cell array
RefDat{k,1} = element;
RefDat{k,2} = quantity;
RefDat{k,3} = group;
end
% Get alphabetically sorted list of all of the possible elements
allElements = sort(unique(vertcat(ProbeDat{:,1},RefDat{:,1})));
numVbls = numel(allElements); % total number of elements
% Make tables for probe data and databank (reference)
Probe = array2table(zeros(size(ProbeDat,1),numVbls),...
'VariableNames',allElements);
Probe.Name = cell(height(Probe),1);
Probe = Probe(:,["Name";allElements]); % Make Name the first column
DataBank = array2table(zeros(size(RefDat,1),numVbls+1),...
'VariableNames',['Group';allElements]);
% Loop through cell array assigning table row values to Probe table
% table has a column for each possible element, and entries with the
% composition. The leading column has the string name for the
% composition
for k = 1:height(Probe)
for m = 1:numel(ProbeDat{k,1})
Probe.(ProbeDat{k,1}{m})(k) = ProbeDat{k,2}(m);
end
Probe.Name{k} = ProbeDat{k,3};
end
% Loop through cell array assigning table row values to DataBank table
% table has a column for each possible element, and the entries with the
% composition, the leading column gives the group number for each
% composition
for k = 1:height(DataBank)
for m = 1:numel(RefDat{k,1})
DataBank.(RefDat{k,1}{m})(k) = RefDat{k,2}(m);
end
DataBank.Group(k) = RefDat{k,3};
end
disp(DataBank)
    Group  Al    Bi    Ca    Co    Fe    K     Mg    Na    Ni    Pb    Si    U     V
    _____  ____  ____  ____  ____  ____  ____  ____  ____  ____  ____  ____  ____  ____
    204    0     0     0     0     0     0     0     1     0     0     0     1     0
    205    0     1     0     0     0     0     0     0     0     0     0     0     1
    206    0     0     0     0     1     0     0     0     0     0     1     0     0
    207    0     0     0     0     0     0     1     0     0     0     0.11  0     0
    208    0     0     0     0     0     0     0.83  0     0     0     1     0     0
    209    0.2   0     0.1   0     0     0     0     0     0     0     1     0     0
    210    0     0     0.83  0     0     0     0     0     0     0     1     0     0
    211    1     0     0.67  0     0     0     0     0     0     0     1     0     0
    212    0     0     0     0.5   0     0     0     1     0     0     0     0     0
    213    0     0     0     1     0     0     0     0     0     0.5   0     0     0
    214    0     0     0     0     0     0     0     0.5   0     0     1     0     0
    215    1     0     1     0     0     0     0     0     0     0     0     0     0
    216    0     0     0     0.5   0     0     0     0     1     0     0     0     0
% Loop through rows in Probe table looking for a matching group in DataBank
% and labeling accordingly
numCol = width(Probe); % number of columns before adding label
for k = 1:height(Probe)
idl = ismembertol(DataBank{:,2:end},Probe{k,2:numCol},tol,...
'DataScale',1,'ByRows',true);
if any(idl)
% Label the group, assume just one match or none
Probe.Group(k) = DataBank.Group(idl);
else
Probe.Group(k) = NaN; % NaN for not found
end
end
% Reorder the columns so that the group match is the second column, just to
% make it be more readable
Probe = Probe(:,['Name';'Group';allElements]);
disp(Probe)
    Name                  Group  Al    Bi    Ca    Co    Fe    K     Mg    Na    Ni    Pb    Si    U     V
    ____________________  _____  ____  ____  ____  ____  ____  ____  ____  ____  ____  ____  ____  ____  ____
    {'1Fe'}               NaN    0     0     0     0     1     0     0     0     0     0     0     0     0
    {'1Fe'}               NaN    0     0     0     0     1     0     0     0     0     0     0     0     0
    {'1Fe,1Si'}           206    0     0     0     0     1     0     0     0     0     0     1     0     0
    {'1Fe'}               NaN    0     0     0     0     1     0     0     0     0     0     0     0     0
    {'0.35Si,1Fe'}        NaN    0     0     0     0     1     0     0     0     0     0     0.35  0     0
    {'0.19Fe,0.57K,1Si'}  NaN    0     0     0     0     0.19  0.57  0     0     0     0     1     0     0
    {'0.47Fe,0.68K,1Si'}  NaN    0     0     0     0     0.47  0.68  0     0     0     0     1     0     0
    {'1Si'}               209    0     0     0     0     0     0     0     0     0     0     1     0     0
    {'0.81Fe,1Si'}        206    0     0     0     0     0.81  0     0     0     0     0     1     0     0
    {'1Fe'}               NaN    0     0     0     0     1     0     0     0     0     0     0     0     0
    {'1Si'}               209    0     0     0     0     0     0     0     0     0     0     1     0     0
    {'1Fe'}               NaN    0     0     0     0     1     0     0     0     0     0     0     0     0
    {'0.44Fe,1Si'}        NaN    0     0     0     0     0.44  0     0     0     0     0     1     0     0
    {'1Na'}               NaN    0     0     0     0     0     0     0     1     0     0     0     0     0
    {'1Fe'}               NaN    0     0     0     0     1     0     0     0     0     0     0     0     0
    {'1Si'}               209    0     0     0     0     0     0     0     0     0     0     1     0     0
    {'0.43Fe,1Si'}        NaN    0     0     0     0     0.43  0     0     0     0     0     1     0     0
    {'0.6Si,1Fe'}         NaN    0     0     0     0     1     0     0     0     0     0     0.6   0     0
    {'1Fe'}               NaN    0     0     0     0     1     0     0     0     0     0     0     0     0
    {'0.15Fe,1Si'}        209    0     0     0     0     0.15  0     0     0     0     0     1     0     0
    {'1Fe'}               NaN    0     0     0     0     1     0     0     0     0     0     0     0     0
    {'1Fe'}               NaN    0     0     0     0     1     0     0     0     0     0     0     0     0
% Make summary table with just names and group numbers, NaN means no match
% found, i.e. "unbekannt"
summary = Probe(:,{'Name','Group'})
summary = 22×2 table
    Name                  Group
    ____________________  _____
    {'1Fe'}               NaN
    {'1Fe'}               NaN
    {'1Fe,1Si'}           206
    {'1Fe'}               NaN
    {'0.35Si,1Fe'}        NaN
    {'0.19Fe,0.57K,1Si'}  NaN
    {'0.47Fe,0.68K,1Si'}  NaN
    {'1Si'}               209
    {'0.81Fe,1Si'}        206
    {'1Fe'}               NaN
    {'1Si'}               209
    {'1Fe'}               NaN
    {'0.44Fe,1Si'}        NaN
    {'1Na'}               NaN
    {'1Fe'}               NaN
    {'1Si'}               209
%
function [num,str] = numstr(s)
% splits string s into numeric and character parts
isNum = isstrprop(s,"digit")| isstrprop(s,"punct"); % digit or decimal point
num = str2double(s(isNum));
str = s(~isNum);
end
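The labeling loop above assumes that at most one databank row matches each sample; if two rows fell within tolerance, the assignment to Probe.Group(k) would fail. One possible way to handle that case, and to produce the "unbekannt" label instead of NaN, is sketched below. Picking the candidate with the smallest worst-case deviation is an assumption about what "best match" should mean, and the second reference row (group 999) and the sample values are hypothetical, chosen only so that two rows match.

```matlab
% Reference rows, columns are [Al Ca Si]
refGroups = [209; 999];          % 999 is a made-up group for illustration
ref = [0.2 0.1 1;                % 0.1Ca,0.2Al,1Si (Group 209)
       0   0.3 1];               % hypothetical second entry
sample = [0.15 0.15 1];          % hypothetical sample composition
tol = 0.2;

% Largest per-element deviation of each reference row from the sample
dev = max(abs(ref - sample), [], 2);
isMatch = dev <= tol;            % same rule as an absolute row-wise tolerance

if ~any(isMatch)
    label = "unbekannt";         % no databank entry within tolerance
else
    dev(~isMatch) = Inf;         % ignore rows outside the tolerance
    [~, best] = min(dev);        % closest of the remaining candidates
    label = "Group " + refGroups(best);
end
disp(label)
```

Both rows are within tolerance of this sample, and the first is chosen because its largest deviation (about 0.05) is smaller than that of the second (0.15).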
14 Comments
Tatjana Mü on 24 Oct 2023
THANK YOU SO MUCH!! I really appreciate your help. It's working perfectly
Jon on 24 Oct 2023
You're welcome. If you are going to use this code a lot or share it with others, you could improve it further. Specifically, looking at the whole code, I see that you first combine the mineral and composition names, and then the code I gave you splits them apart again. You could instead do the processing that I do after splitting them apart while they are still separate at the beginning. Also, as I mentioned, you could check for badly placed spaces or punctuation (a comma instead of a decimal point) if others will be modifying the .xlsx databank file. Otherwise, if the files aren't changing and you just need it to work, you should be all set with what you have. Good luck with your project.
