knnsearch

Find nearest neighbors by edit distance

Syntax

idx = knnsearch(eds,words)
[idx,d] = knnsearch(eds,words)
[idx,d] = knnsearch(eds,words,Name,Value)

Description

idx = knnsearch(eds,words) finds the indices of the nearest neighbors in the edit distance searcher eds to each element in words.

[idx,d] = knnsearch(eds,words) also returns the edit distances between the elements of words and the nearest neighbors.

[idx,d] = knnsearch(eds,words,Name,Value) specifies additional options using one or more name-value pair arguments.

Examples

Create an edit distance searcher.

vocabulary = ["MathWorks" "MATLAB" "Simulink"];
eds = editDistanceSearcher(vocabulary,2);

Find the nearest words to "MALTAB" and "MatWorks".

words = ["MALTAB" "MatWorks"];
idx = knnsearch(eds,words)
idx = 2×1

     2
     1

Get the words from the vocabulary using the returned indices.

nearestWords = eds.Vocabulary(idx)
nearestWords = 1x2 string array
    "MATLAB"    "MathWorks"

Create an edit distance searcher.

vocabulary = ["MATLAB" "Simulink" "MathWorks"];
eds = editDistanceSearcher(vocabulary,2);

Find the nearest words and their edit distances to "MatWorks" and "MALTAB".

words = ["MatWorks" "MALTAB"];
[idx,d] = knnsearch(eds,words)
idx = 2×1

     3
     1

d = 2×1

     1
     2

Get the words from the vocabulary using the returned indices.

nearestWords = eds.Vocabulary(idx)
nearestWords = 1x2 string array
    "MathWorks"    "MATLAB"

Changing the word "MatWorks" to "MathWorks" requires one edit: an insertion. Changing the word "MALTAB" into "MATLAB" requires two edits: a deletion and an insertion.
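
As a quick check of these counts, the following sketch calls the Text Analytics Toolbox function editDistance on the same word pairs. It assumes the default costs (1 for each insertion, deletion, and substitution), which the searcher above is also assumed to use because it was created without any cost options. The variable names dInsert and dTwoEdits are illustrative only.

```matlab
% Pairwise edit distances for the two query words, assuming default costs.
dInsert = editDistance("MatWorks","MathWorks");   % one inserted "h"
dTwoEdits = editDistance("MALTAB","MATLAB");      % two single-character edits

% Compare with the d values returned by knnsearch in the example above.
disp([dInsert dTwoEdits])
```

If the two values match the d output of knnsearch, the searcher and editDistance agree on how each edit is counted.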

Create an edit distance searcher.

vocabulary = ["MathWorks" "MATLAB" "Analytics"];
eds = editDistanceSearcher(vocabulary,5);

Find the two nearest words to "Math" and "Analysis".

words = ["Math" "Analysis"];
idx = knnsearch(eds,words,'K',2)
idx = 2×2

     1     2
     3   NaN

View the two closest words to "Math".

idxMath = idx(1,:);
newWords = eds.Vocabulary(idxMath)
newWords = 1x2 string array
    "MathWorks"    "MATLAB"

There is only one word within the maximum edit distance from "Analysis", so the function returns NaN for the other indices. View the nearest words with valid indices.

idxAnalysis = idx(2,:);
idxAnalysis(isnan(idxAnalysis)) = [];
newWords = eds.Vocabulary(idxAnalysis)
newWords = 
"Analytics"

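The same NaN filtering can be applied to every input word in one pass. The following is a minimal sketch, not part of the original example, that loops over the rows of idx and keeps only the valid indices. The names valid and neighbors are illustrative.

```matlab
vocabulary = ["MathWorks" "MATLAB" "Analytics"];
eds = editDistanceSearcher(vocabulary,5);
words = ["Math" "Analysis"];
idx = knnsearch(eds,words,'K',2);

% Each row of idx belongs to one input word. NaN marks a neighbor slot
% that could not be filled within the maximum edit distance.
for i = 1:numel(words)
    valid = idx(i,~isnan(idx(i,:)));      % drop the NaN entries for this word
    neighbors = eds.Vocabulary(valid);    % look up the remaining neighbors
    fprintf("%s: %s\n",words(i),join(neighbors,", "));
end
```

Filtering row by row avoids indexing the vocabulary with NaN, which would otherwise cause an error.
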
Input Arguments

eds: Edit distance searcher, specified as an editDistanceSearcher object.

words: Input words, specified as a string vector, character vector, or cell array of character vectors. If you specify words as a character vector, then the function treats the argument as a single word.

Data Types: string | char | cell
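
The three accepted input forms can be sketched as follows. This is an illustrative example that reuses the vocabulary from the first example above. The names idxString, idxChar, and idxCell are hypothetical.

```matlab
vocabulary = ["MathWorks" "MATLAB" "Simulink"];
eds = editDistanceSearcher(vocabulary,2);

% String vector: each element is a separate word.
idxString = knnsearch(eds,["MALTAB" "MatWorks"]);

% Character vector: the whole vector is treated as a single word.
idxChar = knnsearch(eds,'MALTAB');

% Cell array of character vectors: each cell is a separate word.
idxCell = knnsearch(eds,{'MALTAB','MatWorks'});
```

The string vector and the cell array each describe two words, so they should yield one row of indices per word, whereas the character vector yields a single row.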

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: knnsearch(eds,words,'K',3) finds the nearest three neighbors in eds to the elements of words.

'K': Number of nearest neighbors to find for each element in words, specified as a positive integer.

Example: 'K',3

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

'IncludeTies': Option to return neighbors whose distance values are equal, specified as true or false.

If 'IncludeTies' is false, then the function returns the K neighbors with the shortest edit distance, where K is the number of neighbors to find. In this case, the function outputs N-by-K matrices, where N is the number of input words. To specify K, use the 'K' name-value pair argument.

If 'IncludeTies' is true, then the function also returns the neighbors whose distances are equal to the Kth smallest distance in the output. In this case, the function outputs cell arrays of size N-by-1, where N is the number of input words. The elements of the cell arrays are vectors with at least K elements. The function sorts the neighbors in each vector in ascending order of distance.

Example: 'IncludeTies',true

Data Types: logical
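
Because 'IncludeTies',true changes the outputs from matrices to cell arrays, code that consumes them needs cell indexing. The following is a hedged sketch using the vocabulary from the third example above. The loop, the NaN guard, and the names keep and neighbors are illustrative assumptions rather than part of the original page.

```matlab
vocabulary = ["MathWorks" "MATLAB" "Analytics"];
eds = editDistanceSearcher(vocabulary,5);
words = ["Math" "Analysis"];

% Ask for the single nearest neighbor, but also keep any ties.
[idx,d] = knnsearch(eds,words,'K',1,'IncludeTies',true);

% idx and d are cell arrays with one vector per input word.
for i = 1:numel(words)
    keep = ~isnan(idx{i});                       % guard against missing neighbors
    neighbors = eds.Vocabulary(idx{i}(keep));
    distances = d{i}(keep);
    fprintf("%s: %s (distances: %s)\n",words(i), ...
        join(neighbors,", "),num2str(distances(:).'));
end
```

A word with several equally close neighbors gets a vector with more than K elements, so the vector lengths can differ from one input word to the next.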

Output Arguments

idx: Indices of nearest neighbors in the searcher, returned as a matrix or a cell array of vectors.

If 'IncludeTies' is false, then idx is an N-by-K matrix, where N is the number of input words and K is the number of neighbors to find. Each row contains the indices of the K neighbors with the shortest edit distance to the corresponding word. To specify K, use the 'K' name-value pair argument.

If 'IncludeTies' is true, then idx also includes the neighbors whose distances are equal to the Kth smallest distance. In this case, idx is an N-by-1 cell array, where N is the number of input words. The elements of the cell array are vectors with at least K elements, sorted in ascending order of distance.

Data Types: double | cell

d: Edit distances to neighbors, returned as a matrix or a cell array of vectors.

If 'IncludeTies' is false, then d is an N-by-K matrix, where N is the number of input words and K is the number of neighbors to find. Each row contains the edit distances from the corresponding word to its K nearest neighbors. To specify K, use the 'K' name-value pair argument.

If 'IncludeTies' is true, then d also includes the distances of the neighbors that tie with the Kth smallest distance. In this case, d is an N-by-1 cell array, where N is the number of input words. The elements of the cell array are vectors with at least K elements, sorted in ascending order of distance.

Data Types: double | cell

Introduced in R2019a