Length of documents in document array


Syntax

N = doclength(documents)



Description

N = doclength(documents) returns the number of tokens in each document in documents.


Examples

Find the number of words in each document of a tokenized document array. Erase the punctuation first so that punctuation tokens are not counted as words.

str = [ ...
    "An example of a short sentence." 
    "A second short sentence."];
documents = tokenizedDocument(str)
documents = 
  2x1 tokenizedDocument:

    7 tokens: An example of a short sentence .
    5 tokens: A second short sentence .

documents = erasePunctuation(documents)
documents = 
  2x1 tokenizedDocument:

    6 tokens: An example of a short sentence
    4 tokens: A second short sentence

N = doclength(documents)
N = 2×1

     6
     4

Input Arguments

documents: input documents, specified as a tokenizedDocument array.

Output Arguments

N: document lengths, returned as a vector of nonnegative integers. The size of N is the same as the size of documents.
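
Because N has one entry per document, it can be compared against a threshold and used as a logical index into the document array. The following is a minimal sketch of that idea, not part of the original example: the input strings and the threshold name minLength are illustrative assumptions.

```matlab
% Build a small tokenizedDocument array (illustrative strings, no punctuation).
str = [ ...
    "An example of a short sentence"
    "A second short sentence"
    "Short"];
documents = tokenizedDocument(str);

% N has the same size as documents, one token count per document.
N = doclength(documents);

% Keep only the documents with at least minLength tokens.
% minLength is a hypothetical threshold chosen for this sketch.
minLength = 4;
longDocuments = documents(N >= minLength);
```

Indexing with N >= minLength works here only because N and documents have the same size, so each logical value lines up with exactly one document.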

Introduced in R2017b