writeWordEmbedding

Write word embedding file

Syntax

writeWordEmbedding(emb,filename)

Description

writeWordEmbedding(emb,filename) writes the word embedding emb to the file filename. The function writes the vocabulary in word2vec text format, encoded as UTF-8.

Examples

Write Word Embedding to File

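Train a word embedding and write it to a text file.

Load the example data. The file sonnetsPreprocessed.txt contains preprocessed versions of Shakespeare's sonnets. The file contains one sonnet per line, with words separated by spaces. Extract the text from sonnetsPreprocessed.txt, split the text into documents at newline characters, and then tokenize the documents.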

filename = "sonnetsPreprocessed.txt";
str = extractFileText(filename);
textData = split(str,newline);
documents = tokenizedDocument(textData);

Train a word embedding using trainWordEmbedding.

emb = trainWordEmbedding(documents)

Training: 100% Loss: 3.07847  Remaining time: 0 hours 0 minutes.

emb = 
  wordEmbedding with properties:

     Dimension: 100
    Vocabulary: ["thy"    "thou"    "love"    "thee"    "doth"    "mine"    "shall"    "eyes"    "sweet"    "time"    "nor"    "beauty"    "yet"    "art"    "heart"    "o"    "thine"    "hath"    "fair"    "make"    "still"    …    ] (1×401 string)

Write the word embedding to a text file.

filename = "exampleSonnetsEmbedding.vec";
writeWordEmbedding(emb,filename)
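
Because the output is plain text, one way to check what was written is to look at the first few lines of the file. The following is a minimal sketch, assuming MATLAB R2020b or later (for readlines) and that exampleSonnetsEmbedding.vec from the previous step is in the current folder. The variable names fileLines and numToShow are illustrative only and do not appear in the original example.

```matlab
% Read the written embedding file as UTF-8 text, one string per line.
fileLines = readlines("exampleSonnetsEmbedding.vec","Encoding","UTF-8");

% Display the first few lines. In word2vec text format, each entry line
% holds a word followed by the numeric components of its vector.
numToShow = min(3,numel(fileLines));
disp(fileLines(1:numToShow))
```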

Read the word embedding file using readWordEmbedding.

emb = readWordEmbedding(filename)

emb = 
  wordEmbedding with properties:

     Dimension: 100
    Vocabulary: ["thy"    "thou"    "love"    "thee"    "doth"    "mine"    "shall"    "eyes"    "sweet"    "time"    "nor"    "beauty"    "yet"    "art"    "heart"    "o"    "thine"    "hath"    "fair"    "make"    "still"    …    ] (1×401 string)
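
The example above reuses the variable emb for the embedding that is read back, so the trained and re-read embeddings cannot be compared directly. The following sketch shows one possible round-trip check under stated assumptions: sonnetsPreprocessed.txt is on the MATLAB path, the file name roundTripEmbedding.vec and all variable names are hypothetical, and max with the "all" option requires R2018b or later. No particular result is claimed. In particular, the vectors may differ slightly if the text file stores rounded values.

```matlab
% Train an embedding and keep it in its own variable.
str = extractFileText("sonnetsPreprocessed.txt");
documents = tokenizedDocument(split(str,newline));
embOriginal = trainWordEmbedding(documents,"Verbose",0);

% Write the embedding to a (hypothetical) file and read it back.
roundTripFile = "roundTripEmbedding.vec";
writeWordEmbedding(embOriginal,roundTripFile)
embRead = readWordEmbedding(roundTripFile);

% Compare the vocabularies of the two embeddings.
sameVocabulary = isequal(embOriginal.Vocabulary,embRead.Vocabulary)

% Compare the word vectors for every word in the original vocabulary.
vecOriginal = word2vec(embOriginal,embOriginal.Vocabulary);
vecRead = word2vec(embRead,embOriginal.Vocabulary);
maxAbsDiff = max(abs(vecOriginal - vecRead),[],"all")
```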

Input Arguments

`emb` — Input word embedding
`wordEmbedding` object

Input word embedding, specified as a wordEmbedding object.

`filename` — Name of file
string scalar | character vector | 1-by-1 cell array containing a character vector

Name of the file, specified as a string scalar, character vector, or 1-by-1 cell array containing a character vector.

Data Types: string | char | cell
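
As an illustration of the three accepted forms of the file name, the following sketch writes the same embedding three times. It assumes that emb is a wordEmbedding object already in the workspace (for example, from the example above). The output file names are hypothetical.

```matlab
% Assumes emb is a wordEmbedding object already in the workspace.
writeWordEmbedding(emb,"embeddingString.vec")   % string scalar
writeWordEmbedding(emb,'embeddingChar.vec')     % character vector
writeWordEmbedding(emb,{'embeddingCell.vec'})   % 1-by-1 cell array containing a character vector
```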

Version History

Introduced in R2017b
