Does the text analytics toolbox allow users to test out-of-sample perplexity with LDA?

I want to split my data into two samples: one for training and one for testing. I then want to fit the LDA model on the training sample and compute the perplexity of the test sample using the fitted model. Is this possible with the Text Analytics Toolbox?

Accepted Answer

Christopher Creutzig on 26 Nov 2018
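Yes. The second output of logp gives you the perplexity of the documents you pass in, so you can fit the model on one set of documents and evaluate it on another: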
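% Read the sonnets and split the text into one string per sonnet.
txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);
td = tokenizedDocument(sonnets);
% Training sample: build the bag-of-words model from the first 50 documents.
bow = bagOfWords(td(1:50));
% Fit an LDA model with 5 topics on the training sample only.
mdl = fitlda(bow,5,'Verbose',0);
% Test sample: encode held-out documents 51 to 53 against the training
% vocabulary, then take the second output of logp, which is the perplexity.
[~,perpl] = logp(mdl, encode(bow,td(51:53)))
% perpl = 337.4999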
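The example above holds out three fixed documents. If you want a random training/test split instead, a minimal sketch along the same lines could look like the following. The 80/20 ratio, the fixed random seed, and the variable names (tdTrain, tdTest, countsTest, perplTest) are assumptions chosen for illustration; they are not part of the original answer.

```matlab
% Sketch: random train/test split followed by held-out perplexity.
rng(0);                                  % assumed seed, only to make the split repeatable
txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);
td = tokenizedDocument(sonnets);

n = numel(td);
idx = randperm(n);                       % random permutation of the document indices
nTrain = round(0.8*n);                   % assumed 80/20 split
tdTrain = td(idx(1:nTrain));
tdTest  = td(idx(nTrain+1:end));

% Fit the model on the training sample only.
bowTrain = bagOfWords(tdTrain);
mdl = fitlda(bowTrain,5,'Verbose',0);

% Encode the test documents as word counts over the training vocabulary,
% then ask logp for the per-document log-probabilities and the perplexity.
countsTest = encode(bowTrain,tdTest);
[logProbs,perplTest] = logp(mdl,countsTest);
disp(perplTest)
```

Because both the split and the LDA fit involve randomness, the perplexity will generally differ from the value shown above and can change from run to run unless the seed is fixed.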
2 Comments
Stephen Bruestle on 30 Nov 2018
This answer convinced me to buy your product.
