export BERT to MATLAB: Load pre-trained BERT models

This toolbox exports pre-trained BERT transformer models from Python and stores them so that they can be used directly in MATLAB.
Downloads: 95
Updated: 2023/12/6

View License

Export BERT models from Python to MATLAB for NLP
This toolbox exports pre-trained BERT transformer models from Python to MATLAB and stores the models such that they can be used directly with MathWorks' BERT implementation.
Note: MathWorks introduced a native BERT implementation in release R2023b. Starting with version 2.0.1 of this contribution, the exported models can be used directly with MathWorks' native BERT implementation. Before that, it was necessary to download an external transformer implementation for MATLAB; versions 1.x of this contribution handle that case.
This contribution makes it possible, for example, to take a pre-trained German BERT model (such as GBERT) from HuggingFace and use it directly for NLP applications, without having to rely on accessing Python from MATLAB:
  • We have tested exporting models from both PyTorch and TensorFlow.
  • Pre-trained models for a downstream task are supported. This comprises text classification (e.g. sentiment, multiclass, or multilabel models), token classification (e.g. for named entity recognition, NER), and question answering.
  • Models with a different structure than BERT (such as RoBERTa) are not supported.
  • Only models that use the WordPiece tokenizer are currently supported (a quick compatibility check is sketched below).
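Whether a candidate model uses a WordPiece tokenizer can be checked with the standard transformers API before attempting an export. A minimal sketch (this check is not part of the toolbox itself):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-german-cased")
# BERT-style tokenizer classes (BertTokenizer / BertTokenizerFast) are WordPiece-based;
# other classes, e.g. RobertaTokenizerFast (BPE-based), indicate an unsupported model.
print(type(tok).__name__)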
The workflow comprises:
  • Install Python (we have only tested Python 3.9.x-3.11.x)
  • Generate an environment using the “bert2matlab.yml” provided in our “Python” folder. This installs PyTorch, TensorFlow, and HuggingFace’s “transformers” library, which are needed to import the pre-trained Python models. GPU support is not necessary.
  • A specific IDE is not necessary to export models; the Python command-line interface is sufficient.
  • For example, the following commands export a plain, pre-trained German BERT model from HuggingFace. The syntax consists of the HuggingFace model name, the task type (None for a plain model, “text-classification”, “token-classification”, or “question-answering”), and the model format (“tf” or “pt”):
# Open command window
cd "...\exportBertToMatlab\Python"
# Activate the environment created from bert2matlab.yml
# (environment name assumed here; check the "name:" field of the yml)
conda activate bert2matlab
python
# Plain pre-trained BERT model
from helperFunctions import modelToMatlab
modelToMatlab("bert-base-german-cased", None, "tf")  # TensorFlow model
  • For the Python syntax to import a model from HuggingFace, see the included "MinimalExample.txt" file.
  • It is also possible to import your own models: simply provide a path to the model instead of a HuggingFace model name. Since the model is loaded internally via the Transformers function AutoModel.from_pretrained(), the same loading conditions apply as for that function. A separate tokenizer can also be used, specified either by a HuggingFace tokenizer name or by a path to your own tokenizer; the tokenizer must be compatible with the BERT model. This only makes sense if, for example, the folder of your own BERT model contains no tokenizer, so that it has to be provided separately (see the sketch after this list).
  • Use the "readBertFromPython.m" function to load the model into MATLAB and use it for powerful NLP.
By default, the models are imported into the subfolder “Models”, but you can optionally provide any path where the imported model should be stored.
We provide demo files to test the imported models both in Python as well as in Matlab (after import).
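The Python-side check essentially amounts to computing reference outputs with the standard transformers API and comparing them with the MATLAB model's output for the same input. A minimal sketch of such a reference computation (not the shipped demo script itself):

# Compute reference outputs in Python for comparison with the MATLAB side.
from transformers import AutoTokenizer, TFAutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-german-cased")
model = TFAutoModel.from_pretrained("bert-base-german-cased")

inputs = tokenizer("Das ist ein Test.", return_tensors="tf")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)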
--------------------------------
Note that in principle it would also be possible to import pre-trained models from the “sentence-transformers” library (sbert) developed by Nils Reimers to get high-quality sentence/paragraph embeddings, but:
  • The sbert embeddings are mostly based on the MPNet-BERT model with relative positional encoding, which would require a small modification of the MathWorks BERT implementation.
  • Multilingual sentence embeddings rely on Byte-Pair Encoding (BPE), for which we do not have a compatible implementation.
We therefore refrained from adding sentence-transformers to this toolbox.
------------------------------------------------------
Disclaimer: We have tested importing several models, comprising plain BERT models and models with downstream tasks for NER, sentiment, multilabel classification, etc. However, we cannot guarantee that the code works for all BERT implementations.
The code is provided solely for illustrative purposes. We can guarantee neither that it works nor that it is free of errors, and we do not take on any liability.
The code/toolbox is licensed under the BSD License.

Cite As

Moritz Scherrmann (2024). export BERT to MATLAB: Load pre-trained BERT models (https://www.mathworks.com/matlabcentral/fileexchange/125305-export-bert-to-matlab-load-pre-trained-bert-models), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with: R2023b
Compatible with R2023b
Platform Compatibility
Windows macOS Linux
Acknowledgements

Inspired by: Transformer Models

Version  Published  Release Notes
2.0.3

- Update README

2.0.2

- Update of the input structure of BERT heads for downstream tasks. Now it is enough to pass the pooler weights (in the case of sequence classification) and the task-specific weights to the respective functions.
- Update README

1.0.4

This version updates all code to the new transformer implementation in MATLAB R2023b.

1.0.3

Changed license in files