This example shows how to create a word cloud from plain text by reading it into a string array, preprocessing it, and passing it to the
wordcloud function. If you have Text Analytics Toolbox™ installed, then you can create word clouds directly from string arrays. For more information, see
wordcloud (Text Analytics Toolbox).
Read the text from Shakespeare's Sonnets with the fileread function.
sonnets = fileread('sonnets.txt');
sonnets(1:135)
ans = 'THE SONNETS by William Shakespeare I From fairest creatures we desire increase, That thereby beauty's rose might never die,'
Convert the text to a string using the string function. Then, split it on newline characters using the splitlines function.
sonnets = string(sonnets);
sonnets = splitlines(sonnets);
sonnets(10:14)
ans = 5x1 string
    " From fairest creatures we desire increase,"
    " That thereby beauty's rose might never die,"
    " But as the riper should by time decease,"
    " His tender heir might bear his memory:"
    " But thou, contracted to thine own bright eyes,"
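The conversion and line-splitting step can be tried on a small input first. The following is a minimal sketch that uses a made-up two-line character vector (txt, str, and parts are illustrative names, not part of the original example) in place of the contents of sonnets.txt.

```matlab
% Illustrative input: a character vector containing one newline.
% This stands in for the text returned by fileread.
txt = ['first line' newline 'second line'];

% Convert the character vector to a 1-by-1 string.
str = string(txt);

% Split at the newline; the result is a 2-by-1 string array,
% one element per line of text.
parts = splitlines(str);
```

Each element of parts holds one line, which is the form the later preprocessing steps operate on.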
Replace some punctuation characters with spaces.
p = ["." "?" "!" "," ";" ":"];
sonnets = replace(sonnets,p," ");
sonnets(10:14)
ans = 5x1 string
    " From fairest creatures we desire increase "
    " That thereby beauty's rose might never die "
    " But as the riper should by time decease "
    " His tender heir might bear his memory "
    " But thou contracted to thine own bright eyes "
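To see what the replacement does to each element, it may help to run it on a short sample. This sketch uses two made-up sentences (sample and cleaned are illustrative names) with the same punctuation list as above.

```matlab
% Illustrative sample text, not taken from sonnets.txt.
sample = ["Hello, world!"; "Is this a test? Yes: it is."];

% Same punctuation list as in the example.
p = ["." "?" "!" "," ";" ":"];

% Every occurrence of any element of p is replaced with one space.
% A mark that was already followed by a space leaves two spaces.
cleaned = replace(sample, p, " ");
```

Characters that are not in p, such as the apostrophe in "beauty's", are left untouched, which matches the output shown above.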
Split sonnets into a string array whose elements contain individual words. To do this, join all the string elements into a 1-by-1 string and then split on the space characters.
sonnets = join(sonnets);
sonnets = split(sonnets);
sonnets(7:12)
ans = 6x1 string
    "From"
    "fairest"
    "creatures"
    "we"
    "desire"
    "increase"
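The join-then-split pattern can be checked on a two-element array. This is a minimal sketch with made-up input (verses, joined, and words are illustrative names).

```matlab
% Illustrative two-line input.
verses = ["From fairest creatures"; "we desire increase"];

% join concatenates the elements of the column into a 1-by-1 string,
% placing a single space between consecutive elements.
joined = join(verses);

% split with no delimiter argument divides the string at whitespace,
% returning a column with one word per element.
words = split(joined);
```

Joining first means that split is called on a single string, so the result is one column of words rather than a separate row of words for each line.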
Remove words with fewer than five characters.
sonnets(strlength(sonnets)<5) = [];
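The removal step relies on logical indexing: strlength returns one length per element, the comparison produces a logical mask, and assigning [] deletes the selected elements. A minimal sketch on a six-word sample (words and isShort are illustrative names):

```matlab
% Illustrative word list.
words = ["From"; "fairest"; "creatures"; "we"; "desire"; "increase"];

% Logical mask: true where a word has fewer than five characters.
isShort = strlength(words) < 5;

% Assigning the empty array deletes the masked elements.
words(isShort) = [];
```

An equivalent form that keeps the long words instead of deleting the short ones is words = words(strlength(words) >= 5).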
Convert sonnets to a categorical array and then plot using wordcloud. The function plots the unique elements of C with sizes corresponding to their frequency counts.
C = categorical(sonnets);
figure
wordcloud(C);
title("Sonnets Word Cloud")
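To inspect the frequency counts behind the word sizes, one option is to tabulate the categories directly. The sketch below is not part of the original example; it uses a made-up word list and the categories and countcats functions, with names, counts, order, and freq as illustrative variable names.

```matlab
% Illustrative word list with repeated entries.
words = ["beauty"; "sweet"; "beauty"; "heart"; "beauty"; "sweet"];
C = categorical(words);

% Unique words (categories) and how often each one occurs.
names  = categories(C);   % cell array of character vectors
counts = countcats(C);    % one count per category

% Sort from most to least frequent and collect in a table.
[counts, order] = sort(counts, "descend");
freq = table(string(names(order)), counts, ...
    'VariableNames', {'Word', 'Count'});
```

In this sample, "beauty" occurs three times, so it would be drawn largest in the corresponding word cloud.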