Text Analytics Toolbox

Import and Visualize Text

Import text data into MATLAB from single files or large collections of files, including PDF, HTML, and Microsoft® Word files. Visually explore text data sets using word clouds and text scatter plots.

Clean and Preprocess Text

Apply high-level filtering functions to remove extraneous content, such as URLs, HTML tags, and punctuation. Correct spelling, filter stop words, and normalize words to root form.
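
The kind of cleanup described above can be sketched in a few lines. This is a language-neutral illustration written in Python, not the toolbox's own functions; the name clean_text and the tiny STOP_WORDS set are hypothetical, and spelling correction and root-form normalization are left out for brevity.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "at", "the", "is", "of", "to", "for"}

def clean_text(raw: str) -> list[str]:
    """Strip HTML tags, URLs, and punctuation, then drop stop words."""
    text = re.sub(r"<[^>]+>", " ", raw)        # remove HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^\w\s]", " ", text)       # remove punctuation
    tokens = text.lower().split()              # lowercase, split on whitespace
    return [t for t in tokens if t not in STOP_WORDS]

sample = "<p>See the manual at https://example.com for details!</p>"
cleaned = clean_text(sample)
```

The order of the steps matters in this sketch: tags and URLs are removed before punctuation, because stripping punctuation first would break the patterns that recognize them.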

Convert Text to Structured Format

Extract linguistic features by using a tokenization algorithm, calculate word frequency statistics to represent text data numerically, and train word embedding models such as word2vec (continuous bag-of-words or skip-gram).
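
As a rough illustration of representing text numerically through word frequency statistics, the sketch below builds a small document-term count matrix. It is plain Python rather than the toolbox API; the tokenize helper, its regular expression, and the two sample sentences are assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens (a simple regex tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())

docs = [
    "The pump failed.",
    "The pump was replaced; the valve failed.",
]

# Vocabulary: every distinct token across the collection, in sorted order.
vocab = sorted({tok for d in docs for tok in tokenize(d)})

# One row per document, one column per vocabulary word.
counts = []
for d in docs:
    freq = Counter(tokenize(d))
    counts.append([freq[w] for w in vocab])
```

Each row of counts is a fixed-length numeric vector, which is the form that most downstream models expect.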

Apply AI to Text Analytics

Fit a machine learning or deep learning model, such as latent semantic analysis (LSA), latent Dirichlet allocation (LDA), or a long short-term memory (LSTM) network, to text data. Use transformer models, such as BERT, FinBERT, and GPT-2, to perform transfer learning with text data.

Large Language Models

Connect MATLAB to the OpenAI™ Chat Completions API. Leverage the natural language processing capabilities of GPT models within your MATLAB environment, for tasks such as text summarization and chatting.

Text Analytics for Engineers

Develop predictive maintenance schedules based on sensor data and text logs. Automate requirement formalization and compliance checking.

Document Analysis

Analyze text with topic modeling to discover and visualize underlying patterns, trends, and complex relationships. Summarize documents, extract keywords, and evaluate document importance and similarity.
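
One common way to score document similarity is the cosine of the angle between term-count vectors. The sketch below assumes that measure and uses whitespace tokenization; it is a minimal Python illustration, not necessarily how the toolbox computes similarity, and the function name cosine_similarity is chosen for the example.

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the term-count vectors of two texts."""
    va = Counter(a.lower().split())
    vb = Counter(b.lower().split())
    # Only words present in both texts contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)

same = cosine_similarity("pump failure report", "pump failure report")
partial = cosine_similarity("pump failure report", "valve failure report")
disjoint = cosine_similarity("pump failure report", "quarterly sales summary")
```

Scores range from 0 for texts with no words in common to 1 for texts with identical word proportions.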

Sentiment Analysis

Identify the attitudes and opinions expressed in text data to categorize statements as being positive, neutral, or negative. Build models that can predict sentiment in real time.
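
The three-way categorization can be illustrated with a simple lexicon-based scorer. This is a toy Python sketch under stated assumptions: the POSITIVE and NEGATIVE word sets are hypothetical, and real sentiment models handle negation, intensity, and context that this example ignores.

```python
# Hypothetical miniature sentiment lexicons, for illustration only.
POSITIVE = {"good", "great", "reliable", "excellent"}
NEGATIVE = {"bad", "poor", "faulty", "terrible"}

def classify_sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

labels = [
    classify_sentiment("The new sensor is reliable and great."),
    classify_sentiment("Faulty valve, poor seal!"),
    classify_sentiment("The unit shipped on Monday."),
]
```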

Text Generation and Classification

Use deep learning to generate new text based on observed text, and to classify text descriptions into categories by using word embeddings.