Code Generation for Convolutional LSTM Network That Uses Intel MKL-DNN

This example shows how to generate a MEX function for a deep learning network containing both convolutional and bidirectional long short-term memory (BiLSTM) layers that uses the Intel Math Kernel Library for Deep Neural Networks (MKL-DNN). The generated MEX function takes a sequence of video frames read from a specified video file and outputs a label that classifies the activity in the video. For more information on the training of this network, see the example Classify Videos Using Deep Learning (Deep Learning Toolbox).

Third-Party Prerequisites

Intel Math Kernel Library for Deep Neural Networks (MKL-DNN)
For a list of processors that support the MKL-DNN library, see MKLDNN CPU Support.
For more information on the supported versions of the compilers and libraries, see Prerequisites for Deep Learning with MATLAB Coder.

This example is supported on Mac®, Linux®, and Windows® platforms. It is not supported in MATLAB Online.

Prepare Input

Read the video file pushup.mp4 by using the readVideo helper function, which is attached to this example as a supporting file. To view the video, loop over the individual frames and display each one by using the imshow function.

filename = "pushup.mp4";
video = readVideo(filename);
numFrames = size(video,4);
figure
for i = 1:numFrames
    frame = video(:,:,:,i);
    imshow(frame/255);
    drawnow
end
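The readVideo helper ships with the example and is not listed here. As a rough illustration of what such a helper might do, the following sketch reads every frame with VideoReader into an H-by-W-by-3-by-N array of doubles. The name readVideoSketch is hypothetical, the shipped helper may differ, and the sketch assumes an RGB video.

```matlab
% Hypothetical stand-in for the readVideo helper (assumption: RGB video).
% Save as readVideoSketch.m. The helper shipped with the example may differ.
function video = readVideoSketch(filename)
    vr = VideoReader(filename);
    % Preallocate from the reported duration. The true frame count can
    % differ slightly, so the array is trimmed after reading.
    numFrames = floor(vr.Duration * vr.FrameRate);
    video = zeros(vr.Height, vr.Width, 3, numFrames);
    k = 0;
    while hasFrame(vr)
        k = k + 1;
        % readFrame returns uint8 data; storing it in a double array keeps
        % the 0-255 range, which is why the display loop divides by 255.
        video(:,:,:,k) = readFrame(vr);
    end
    video = video(:,:,:,1:k);   % drop any unused preallocated frames
end
```

The fourth dimension of the result is the frame index, which matches the size(video,4) call used above to count frames.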

Center-crop the input video frames to the input size of the trained network by using the centerCrop helper function attached as a supporting file.

inputSize = [224 224 3];
video = centerCrop(video,inputSize);
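The centerCrop helper is also provided only as a supporting file. One plausible way to implement it is to keep the largest centered square of each frame and then resize that square to the network input size. The sketch below does this under that assumption. The name centerCropSketch is hypothetical, and the shipped helper may crop or resize differently.

```matlab
% Hypothetical stand-in for the centerCrop helper.
% Save as centerCropSketch.m. The helper shipped with the example may differ.
function videoOut = centerCropSketch(video, inputSize)
    [h, w, ~, ~] = size(video);
    side = min(h, w);                 % edge length of the largest centered square
    r0 = floor((h - side)/2);         % rows skipped above the square
    c0 = floor((w - side)/2);         % columns skipped to the left of the square
    square = video(r0+1:r0+side, c0+1:c0+side, :, :);
    % imresize resizes only the first two dimensions, so the channel and
    % frame dimensions pass through unchanged.
    videoOut = imresize(square, inputSize(1:2));
end
```

Cropping to a square before resizing preserves the aspect ratio of the content, at the cost of discarding the left and right (or top and bottom) margins of each frame.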

The `video_classify` Entry-Point Function

The video_classify.m entry-point function takes an image sequence and passes it to a trained network for prediction. This function uses the convolutional LSTM network that is trained in the example Classify Videos Using Deep Learning (Deep Learning Toolbox). On the first call, the function loads the network object from the net.mat file into a persistent variable and then uses the classify (Deep Learning Toolbox) function to perform the prediction. On subsequent calls, the function reuses the already loaded persistent object.

type('video_classify.m')

function out = video_classify(in) %#codegen

% During the execution of the first function call, the network object is
% loaded in the persistent variable mynet. In subsequent calls, this loaded
% object is reused. 

persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('net.mat');
end

% Provide input and perform prediction
out = classify(mynet,in);

Generate MEX

To generate a MEX function, create a coder.MexCodeConfig object cfg. Set the TargetLang property of cfg to C++. Use the coder.DeepLearningConfig function to create a deep learning configuration object for MKL-DNN, and assign it to the DeepLearningConfig property of cfg.

cfg = coder.config('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('mkldnn');

Run the getVideoClassificationNetwork helper function to download the video classification network and save the network in the MAT file net.mat.

getVideoClassificationNetwork();

Use the coder.typeof function to specify the type and size of the input argument to the entry-point function. In this example, the input is of double type with size [224 224 3] and a variable sequence length.

Input = coder.typeof(double(0),[224 224 3 Inf],[false false false true]);
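To see how this type specification encodes the fixed and variable dimensions, it can help to inspect the returned type object. The following sketch rebuilds the same specification under the illustrative name inputType and displays its ClassName, SizeVector, and VariableDims properties. It requires MATLAB Coder.

```matlab
% Rebuild the input type from the example and inspect it.
% inputType is an illustrative name, not one used by the example.
inputType = coder.typeof(double(0), [224 224 3 Inf], [false false false true]);

disp(inputType.ClassName)      % class of the input elements
disp(inputType.SizeVector)     % upper bound for each dimension
disp(inputType.VariableDims)   % true where the size can change at run time
```

Only the fourth dimension, the number of frames, is marked as variable, so the generated MEX function accepts videos of any length but rejects frames that are not 224-by-224-by-3.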

Generate a MEX function by running the codegen command.

codegen -config cfg video_classify -args {Input} -report

Code generation successful: View report

Run Generated MEX

Run the generated MEX function with the center-cropped video as input.

output = video_classify_mex(video)

output = categorical
     pushup

Overlay the prediction onto the input video.

video = readVideo(filename);
numFrames = size(video,4);
figure
for i = 1:numFrames
    frame = video(:,:,:,i);
    frame = insertText(frame, [1 1], char(output), 'TextColor', [255 255 255],'FontSize',30, 'BoxColor', [0 0 0]);
    imshow(frame/255);
    drawnow
end

