Code Generation for Sequence-to-Sequence Classification with Learnables Compression
This example shows how to generate code and classify each time step of sequence data using a sequence-to-sequence long short-term memory (LSTM) network. A sequence-to-sequence LSTM network enables you to make different predictions for each individual time step of the sequence data. The example demonstrates how to generate a C library using both the single-precision floating-point data type and the brain floating-point data type (bfloat16). When learnables compression is enabled, the learnables are stored in the bfloat16 format, which greatly reduces memory usage. However, computation is still performed in single precision, so any hardware that supports the single-precision floating-point data type can use bfloat16 compression; the processor does not need native bfloat16 support. For information on the bfloat16 format, see Generate bfloat16 Code for Deep Learning Networks.
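To build intuition for the storage format, note that bfloat16 keeps the sign bit, all 8 exponent bits, and the top 7 mantissa bits of a single-precision value. The following sketch (not part of the generated code) emulates that truncation in MATLAB to show the rounding error introduced for one value; the actual conversion performed by the generated code may use a different rounding mode.
% Sketch: emulate bfloat16 storage by zeroing the low 16 bits of a single value.
x      = single(pi);
bits   = typecast(x,'uint32');          % raw IEEE 754 bit pattern
bfBits = bitand(bits,0xFFFF0000u32);    % keep sign, exponent, and top 7 mantissa bits
xBf    = typecast(bfBits,'single');     % value as it would be stored in bfloat16
relErr = abs(x - xBf)/abs(x)            % relative error introduced by the truncation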
This example uses sensor data obtained from a smartphone worn on the body. The example trains an LSTM network to recognize the activity of the wearer given time series data representing accelerometer readings in three different directions. The training data contains time series data for seven people. Each sequence has three features and varies in length. The data set contains six training observations and one test observation. For more information, see Sequence-to-Sequence Classification Using Deep Learning (Deep Learning Toolbox).
The lstmnet_predict Entry-Point Function
A sequence-to-sequence LSTM network enables you to make different predictions for each individual time step of a data sequence. The lstmnet_predict.m entry-point function takes an input sequence and passes it to a trained LSTM network for prediction. The entry-point function creates a dlarray object internally, so the input and output of the function are of primitive data types. The function loads the dlnetwork object from the lstmnet.mat file into a persistent variable and reuses the persistent object on subsequent prediction calls. For more information, see Code Generation for dlarray (GPU Coder).
To display an interactive visualization of the network architecture and information about the network layers, use the analyzeNetwork (Deep Learning Toolbox) function.
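For example, you can inspect the trained network before generating code. This is a minimal sketch; the variable name net inside lstmnet.mat is an assumption, so adjust it to match the MAT-file contents.
% Sketch: load the trained dlnetwork and open the interactive analyzer.
S = load('lstmnet.mat');   % assumes the network is stored in a variable named net
analyzeNetwork(S.net)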
type('lstmnet_predict.m')
function out = lstmnet_predict(in) %#codegen

% Copyright 2019-2024 The MathWorks, Inc.

dlIn = dlarray(in,'CT');

persistent dlnet;
if isempty(dlnet)
    dlnet = coder.loadDeepLearningNetwork('lstmnet.mat');
end

dlOut = predict(dlnet,dlIn);

out = extractdata(dlOut);
end
Generate SIL Library Without Network Compression
Create a coder.EmbeddedCodeConfig object cfg to generate a static C library. Set the VerificationMode property to 'SIL' and the target language to C. Create a deep learning configuration object that specifies that no target library is required, and attach this deep learning configuration object to cfg. By default, the generated code runs inference in single-precision (32-bit) floating point.
cfg = coder.config("lib"); cfg.VerificationMode = "SIL"; cfg.TargetLang = "C"; cfg.GenerateExampleMain = 'GenerateCodeAndCompile'; cfg.DeepLearningConfig = coder.DeepLearningConfig(TargetLibrary='none');
Specify the type and size of the input argument to the codegen command by using the coder.typeof function. For this example, the input is of the single data type with a feature dimension of 3 and a variable sequence length. Specifying a variable sequence length enables prediction on an input sequence of any length.
matrixInput = coder.typeof(single(0),[3 Inf],[false true]);
cfg.DeepLearningConfig.LearnablesCompression = 'none';
codegen -config cfg lstmnet_predict -args {matrixInput} -d codegenFP32Sil -report
Code generation successful: View report
In your current working directory, the code generator creates a SIL executable lstmnet_predict_sil. It also creates an output folder codegenFP32Sil that contains the library. The lstmnet_predict library, which has a .lib, .a, or .so extension, is approximately 712 KB in size.
Perform Prediction on Test Data
The HumanActivityValidate MAT-file stores the variable XValidate, which contains sample time series of sensor readings on which you can test the generated code. Load the MAT-file and cast the data to single for deployment. Call lstmnet_predict_sil on the first observation.
load HumanActivityValidate
XValidate = cellfun(@single, XValidate, 'UniformOutput', false);
YPred1 = lstmnet_predict_sil(XValidate{1});
### Starting SIL execution for 'lstmnet_predict'
    To terminate execution: clear lstmnet_predict_sil
YPred1 is a 5-by-53,888 numeric matrix containing the probabilities of the five classes for each of the 53,888 time steps. For each time step, find the predicted class by calculating the index of the maximum probability.
[~, maxIndex] = max(YPred1, [], 1);
Associate the indices of maximum probability with the corresponding label. Display the first 10 labels. From the results, you can see that the network predicted the human to be sitting for the first 10 time steps.
labels = categorical({'Dancing','Running','Sitting','Standing','Walking'});
predictedLabels1 = labels(maxIndex);
disp(predictedLabels1(1:10)')
Sitting Sitting Sitting Sitting Sitting Sitting Sitting Sitting Sitting Sitting
Calculate the accuracy of the predictions.
acc1 = sum(predictedLabels1 == YValidate{1})./numel(YValidate{1})
acc1 = 0.9997
Generate SIL Library with Learnables Compression
Generate code that runs inference in bfloat16 precision. Compressing learnables from the single-precision floating-point format to the bfloat16 format reduces the memory usage of deep learning networks with little change in accuracy. For more information, see Generate bfloat16 Code for Deep Learning Networks.
cfg = coder.config('lib');
cfg.VerificationMode = 'SIL';
cfg.DeepLearningConfig = coder.DeepLearningConfig(TargetLibrary='none');
cfg.DeepLearningConfig.LearnablesCompression = 'bfloat16';
cfg.TargetLang = "C";
matrixInput = coder.typeof(single(0),[3 Inf],[false true]);
codegen -config cfg lstmnet_predict -args {matrixInput} -d codegenBF16Sil -report
### Application stopped
### Stopping SIL execution for 'lstmnet_predict'
Code generation successful: View report
The code generator creates an output folder codegenBF16Sil that contains the library. The lstmnet_predict library, which has a .lib, .a, or .so extension, is approximately 390 KB in size. The generated library with learnables compression is about 45% smaller than the generated library without learnables compression. Note that these file sizes were obtained on a Windows system using Microsoft® Visual Studio® 2019; the actual file sizes can vary slightly across compilers and operating systems.
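To verify the size reduction on your own machine, you can compare the generated library files directly. This is a minimal sketch; it assumes a Windows build where the static library is named lstmnet_predict.lib, so adjust the file name and extension for your platform and compiler.
% Sketch: compare the sizes of the two generated libraries (Windows .lib assumed).
fp32Lib = dir(fullfile('codegenFP32Sil','lstmnet_predict.lib'));
bf16Lib = dir(fullfile('codegenBF16Sil','lstmnet_predict.lib'));
fprintf('Size reduction with bfloat16 learnables: %.0f%%\n', ...
    100*(1 - bf16Lib.bytes/fp32Lib.bytes));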
Run bfloat16 Prediction on Test Data
Call lstmnet_predict_sil on the first observation.
YPred2 = lstmnet_predict_sil(XValidate{1});
### Starting SIL execution for 'lstmnet_predict'
    To terminate execution: clear lstmnet_predict_sil
YPred2 is a 5-by-53,888 numeric matrix containing the probabilities of the five classes for each of the 53,888 time steps. For each time step, find the predicted class by calculating the index of the maximum probability.
[~, maxIndex] = max(YPred2, [], 1);
Associate the indices of maximum probability with the corresponding label. Display the first 10 labels. From the results, you can see that the network predicted the human to be sitting for the first 10 time steps.
labels = categorical({'Dancing','Running','Sitting','Standing','Walking'});
predictedLabels2 = labels(maxIndex);
disp(predictedLabels2(1:10)')
Sitting Sitting Sitting Sitting Sitting Sitting Sitting Sitting Sitting Sitting
Calculate the accuracy of the predictions.
acc2 = sum(predictedLabels2 == YValidate{1})./numel(YValidate{1})
acc2 = 0.9997
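Because both sets of predictions are available in the workspace, you can also quantify the effect of learnables compression directly. The following sketch compares the two SIL outputs, reporting the fraction of time steps where the predicted labels agree and the largest difference in the predicted probabilities.
% Sketch: measure how closely the bfloat16 predictions track the FP32 predictions.
labelAgreement = mean(predictedLabels1 == predictedLabels2)   % fraction of matching labels
maxProbDiff    = max(abs(YPred1 - YPred2),[],'all')           % largest per-probability difference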
Compare Predictions with Test Data
Use a plot to compare the SIL outputs with the test data.
figure
subplot(3,1,1)
plot(YValidate{1},'--');
legend("Test Data",Location="southeast")
subplot(3,1,2)
plot(predictedLabels1,'r-');
legend("Predicted without Learnables Compression",Location="southeast")
subplot(3,1,3)
plot(predictedLabels2,'b-');
legend("Predicted with Learnables Compression",Location="southeast")
xlabel("Time Step")
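To summarize agreement per class rather than per time step, you can also plot a confusion chart. This is an optional sketch that assumes the confusionchart function is available in your installation.
% Optional sketch: per-class summary of the compressed predictions against the test labels.
figure
confusionchart(YValidate{1},predictedLabels2);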
See Also
coder.loadDeepLearningNetwork | codegen | coder.DeepLearningConfig | coder.DeepLearningCodeConfig
Related Topics
- Sequence-to-Sequence Classification Using Deep Learning (Deep Learning Toolbox)