- Calculating the Levenshtein distance is not hard, so this is not the problem.
- Creating a sliding window is also not hard, a loop would do it easily.
- You mention making insertions and deletions so that the strings match, but do you not describe any restrictions on these... so without any restriction, you can insert/delete as you desire until the strings match. So far this seems to be equivalent to your current description:
Levenshtein distance: String scanning
18 ビュー (過去 30 日間)
古いコメントを表示
I'm in desperate need of help with creating a Levenshtein distance function with scanning behavior.
Basically there is a target string (s) which is supposed to scan a longer experimental string (s2). For example a target string (s) of length m = 3 should be compared to the first 3 entries of an experimental string (s2). Afterwards the target string (s) should be "moved" by 1 step to be compared to the entries 2-4 of the experimental string (s2) and so on.
Creating such a scanning behavior is not a problem, the real problems are insertions and deletions. After an insertion or deletion I still want to compare 2 strings of equal length. Therefore when inserting another entry, the last entry of the experimental string (s2) has to be removed, and when deleting an entry the next value of the experimental string should be added. Simply spoken the string lengths (m and n) should always be constant and have the same length.
Here a simple example:
For the strings s = [1 2 3] and s2 = [1 3 2 3 ...] I want to compare parts of the same length m = 3, therefore the first comparison should be between s = [1 2 3] and s2 = [1 3 2]. A deletion at position #2 should now change s2 to [entry1 entry3 entry4], in this case to s2 = [1 2 3], leading to an edit distance of 1 (with a deletion cost of 1). Here, an insertion of the value 2 at position #2 should lead to s2 = [entry1 insertion entry2] and therefore also s2 = [1 2 3].
Any ideas how I could achieve this?
I'm using an Levenshtein function similar to this:
% Variables
s = [1 2 3];
m=length(s);
s2 = [1 3 2 3 1 2 3];
t = s2(1:1:m);
n=length(t);
% Edit Matrix
mat=zeros(m+1,n+1);
for i=1:1:m
mat(i+1,1)=i;
end
for j=1:1:n
mat(1,j+1)=j;
end
for i=1:m
for j=1:n
if (s(i) == t(j))
mat(i+1,j+1)=mat(i,j);
else
mat(i+1,j+1)= min([mat(i+1,j)+1,mat(i,j+1)+1,mat(i,j)+1]);
end
end
end
%Edit distance
d = mat(m+1,n+1);
5 件のコメント
回答 (0 件)
参考
カテゴリ
Help Center および File Exchange で Get Started with MATLAB についてさらに検索
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!