This example shows how to use the Simulink® Support Package for Android™ Devices to deploy a deep learning algorithm that detects and tracks an object on an Android device, such as a phone or tablet. The algorithm uses a ResNet-18-based YOLOv2 neural network to identify the object captured by the camera. You can experiment with different objects in your surroundings to see how accurately the network detects them on your Android device.
For more information on how to use the Simulink Support Package for Android Devices to run a Simulink model on your Android device, see Getting Started with Android™ Devices.
Download and install the ARM® Compute Library using the Hardware Setup screen. This example uses ARM Compute Library version 19.05. For more information on the Hardware Setup screen, see Install Support for Android Devices.
Android device such as a phone or tablet
Capture a video of the object that you want to detect and track. You can also follow the training procedure in this example by using a video of your choice. A longer video that shows the object from different viewing angles and under different lighting conditions produces a training data set that gives better detection and identification results.
After you capture the video, transfer the MP4 file to your host machine.
The ground truth data contains information about the data source, label definitions, and marked label annotations for a set of ground truth labels. You can export this data using the Video Labeler app into a MAT file.
To open the Video Labeler app, run this command in the MATLAB® Command Window:
videoLabeler
Follow these steps in the Video Labeler app:
1. In the File section, click Import.
2. Select Add Video and select the video of the object.
3. In the ROI Labels pane, click Label. Create a Rectangular label, name it, and click OK. In this example, the object has the name
4. Use the mouse to draw a rectangular ROI in the video.
5. In the Automate Labeling section, click Select Algorithm and select the Point Tracker algorithm. Then click Automate. The algorithm instructions appear in the right pane, and the selected labels are available to automate.
6. In the Run section, click Run to automate labeling for the video.
7. When you are satisfied with the algorithm results, in the Close section, click Accept.
8. Under Export Labels, select To File to export the labeled data to a MAT file, appledetect.mat.
9. Save the appledetect.mat file in the working directory of the example.
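Before training, it can help to confirm what the exported file contains. The following is a minimal sketch, assuming the Video Labeler app saved the ground truth under the variable name gTruth. If you chose a different variable name during export, adjust the code accordingly.

```matlab
% Load the labels exported from the Video Labeler app.
data = load('appledetect.mat');

% ASSUMPTION: the groundTruth object was exported under the
% variable name gTruth.
gTruth = data.gTruth;

% Inspect the three parts of the ground truth data.
disp(gTruth.DataSource);        % video file that the labels refer to
disp(gTruth.LabelDefinitions);  % table of label names and types
disp(head(gTruth.LabelData));   % first rows of the per-frame ROI annotations
```

If the label table is empty or the data source points to the wrong video, repeat the export before you start training.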
Train the YOLOv2 object detector with the video captured from the camera. The appledetect.mat file contains the exported ground truth data. Use this file to train the YOLOv2 object detector.
The deepresnet18.m file uses a pretrained ResNet-18 neural network as the base of the YOLOv2 detection network for feature extraction. You can find this file in the example folder structure. Make sure that the deepresnet18.m file is present in the working directory of the example. Open this file and configure the following parameters:
1. Specify the name of the MAT file exported using the Video Labeler app in the labelData parameter. In this example, the MAT file is saved as appledetect.mat.
2. Specify the size of the input image for training the network in the imagesize parameter. In this example, the image size is set to [224, 224, 3].
3. Specify the number of object classes the network has to detect in the numClasses parameter. In this example, the parameter is set to 1 to detect and track one apple.
4. Specify the pretrained ResNet-18 network layer as the base network for feature extraction of the object. In this example, ResNet-18 is the base for the YOLOv2 object detector.
5. Specify the network layer to use for feature extraction. In this example, the ResNet-18 neural network extracts features from the res3b_relu layer. This layer outputs 128 features and the activations have a spatial size of 28-by-28.
6. Specify the size of the anchor boxes in the anchorBoxes field. In this example, the parameter is set to
7. Create the YOLOv2 object detection network using the yolov2Layers function.
8. You can also analyze the YOLOv2 network architecture using the analyzeNetwork function. The layers succeeding the feature layer are removed. A series of convolution, ReLU, and batch normalization layers along with the YOLOv2 transform and YOLOv2 output layers are added to the feature layer of the base network.
9. Configure the options for training the deep learning ResNet-18 neural network using the trainingOptions function.
10. After loading the appledetect.mat file, create an image datastore and a box label datastore of training data from the specified ground truth file using the objectDetectorTrainingData function.
11. After combining the datastores, train the YOLOv2 network using the trainYOLOv2ObjectDetector function.
12. After the YOLOv2 detector training is complete, save the MAT file. In this example, it is saved as detectedresnet.mat. Save this MAT file in the current working directory of the example.
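The steps above can be sketched as a single script. This is an illustrative outline under stated assumptions, not the contents of the shipped deepresnet18.m file. The anchor box size and the training option values are placeholders, the ground truth variable is assumed to be named gTruth, and the variable names lgraph, trainingData, and detector are hypothetical.

```matlab
% Configuration values taken from the steps above.
labelData    = 'appledetect.mat';   % MAT file exported from Video Labeler
imagesize    = [224 224 3];         % network input size
numClasses   = 1;                   % one object class
featureLayer = 'res3b_relu';        % feature extraction layer

% ASSUMPTION: placeholder anchor box [height width] in pixels.
% Replace it with a size that suits your object.
anchorBoxes = [64 64];

% Pretrained ResNet-18 base network (requires the ResNet-18 support package).
baseNetwork = resnet18;

% Build the YOLOv2 network on top of the feature layer and inspect it.
lgraph = yolov2Layers(imagesize, numClasses, anchorBoxes, ...
    baseNetwork, featureLayer);
analyzeNetwork(lgraph);

% ASSUMPTION: placeholder training options; tune them for your data.
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-3, ...
    'MiniBatchSize', 16, ...
    'MaxEpochs', 20, ...
    'Shuffle', 'every-epoch', ...
    'Verbose', true);

% Load the ground truth (ASSUMPTION: stored in a variable named gTruth)
% and create the image and box label datastores.
data = load(labelData);
[imds, blds] = objectDetectorTrainingData(data.gTruth);
trainingData = combine(imds, blds);

% Train the detector and save it for deployment.
detector = trainYOLOv2ObjectDetector(trainingData, lgraph, options);
save('detectedresnet.mat', 'detector');
```

Depending on the resolution of your video frames, you might need to resize the images and bounding boxes to match imagesize before training.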
This example uses a preconfigured Simulink model from the Simulink Support Package for Android Devices.
To open the Simulink model, run this command in the MATLAB® Command Window.
open_system('androidObjectClassification');
1. Connect the Android device to the host computer using the USB cable.
2. On the Modeling tab of the Simulink toolstrip, select Model Settings.
3. In the Configuration Parameters dialog box, select Hardware Implementation. Verify that the Hardware board parameter is set to
4. From the Groups list under Target hardware resources, select Device options.
5. From the Device list, select your Android device. If your device is not listed, click Refresh.
Note: If your device is not listed even after you click Refresh, ensure that you have enabled the USB debugging option on your device. To enable USB debugging, enter androidhwsetup in the MATLAB Command Window and follow the on-screen instructions.
6. In the Configuration Parameters dialog box, select Code Generation from the left pane and from the Target selection section, set Language to
7. Select Code Generation > Interface and in the Deep learning section, set these parameters:
a. Set Target library to
b. Select ARM Compute Library version based on the installation you chose in the Hardware Setup screen. In this example, it is set to 19.05.
c. Set ARM Compute Library architecture to
8. Click Apply > OK.
The Android Camera block captures the video of the object using the rear camera of the device. You can configure the following parameters in the Camera Block Parameters dialog box:
1. Set Resolution to Back. To get a list of device-specific resolutions, connect your configured device to the host machine and click Refresh.
2. Set the Sample time to
To open the RGB to Image subsystem, run this command in the MATLAB Command Window.
open_system('androidObjectClassification/RGB to Image');
The R, G, and B data received from the Android Camera block is first transposed from row major to column major. This transposed R, G, and B data is fed to the Matrix Concatenate block. This block concatenates the R, G, and B image data to create a contiguous output signal, Imin. You can configure the following parameters in the Vector Concatenate, Matrix Concatenate Block Parameters dialog box:
1. Set Number of inputs to 3. This value indicates the R, G, and B image data input.
2. Set Mode to Multidimensional to perform multidimensional concatenation on the R, G, and B image data input.
3. Set Concatenate dimension to 3 to specify the output dimension along which to concatenate the input array of R, G, and B image data.
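The transposition and concatenation performed by the RGB to Image subsystem can be sketched in MATLAB code. This illustrates the data flow only. It assumes that each camera plane arrives as a width-by-height uint8 matrix, and the 4-by-3 plane size is a toy value chosen for readability, not a camera resolution.

```matlab
% Toy planes standing in for the camera output.
% ASSUMPTION: each plane is width-by-height, that is, the transpose
% of a MATLAB image plane.
W = 4;
H = 3;
R = uint8(reshape(1:W*H, W, H));
G = R + 100;
B = R + 200;

% Transpose each plane to height-by-width, then stack the planes along
% dimension 3. This mirrors Mode = Multidimensional and
% Concatenate dimension = 3 in the Matrix Concatenate block.
Imin = cat(3, R.', G.', B.');

disp(size(Imin));   % H-by-W-by-3
```

The result has the height-by-width-by-3 layout that MATLAB image functions expect.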
The deeplearning function block uses the YOLOv2-based convolutional neural network (CNN) saved as a MAT file. Pass Imin as an input to the detector network. If the object is detected, Imout contains the bounding box information of the detected object.
Pass the name of the MAT file generated from training the YOLOv2 object detector to the deeplearning function block. In this example, the MAT file is detectedresnet.mat.
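A function of this kind typically loads the detector once and reuses it for every frame. The following is a hedged sketch of what such a function might look like, not the code shipped in the model. The function name, the variable names, and the detection threshold are hypothetical, and the sketch assumes that the MAT file contains a yolov2ObjectDetector object.

```matlab
function Imout = detectObjectSketch(Imin)
% Hypothetical stand-in for the body of the deeplearning function block.
% Imin:  H-by-W-by-3 uint8 image from the RGB to Image subsystem.
% Imout: 224-by-224-by-3 image with a box drawn around each detection.

persistent detector;
if isempty(detector)
    % Load the trained detector once; later calls reuse it.
    detector = coder.loadDeepLearningNetwork('detectedresnet.mat');
end

% Resize the frame to the size used for training, then run detection.
% ASSUMPTION: a threshold of 0.5; tune it for your data.
in = imresize(Imin, [224 224]);
[bboxes, scores] = detect(detector, in, 'Threshold', 0.5);

% Draw the boxes if anything was found; otherwise pass the frame through.
if ~isempty(bboxes)
    Imout = insertObjectAnnotation(in, 'rectangle', bboxes, scores);
else
    Imout = in;
end
end
```

Keeping the detector in a persistent variable avoids reloading the network on every frame, which matters on a mobile device.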
The ImagetoRGB function block again transposes the image data to R, G, and B image values. These R, G, and B image data values are the inputs to the Android Video Display block in the Simulink model.
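The reverse conversion can be sketched in the same way. This is an illustration under assumptions, not the code in the ImagetoRGB block: it assumes a height-by-width-by-3 uint8 image and a display block that takes width-by-height planes. The 3-by-4 image size is a toy value.

```matlab
% Toy annotated image (ASSUMPTION: height-by-width-by-3 uint8).
H = 3;
W = 4;
Imout = uint8(reshape(1:H*W*3, H, W, 3));

% Split the image into planes and transpose each one back to
% width-by-height.
R = Imout(:, :, 1).';
G = Imout(:, :, 2).';
B = Imout(:, :, 3).';

disp(size(R));   % W-by-H
```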
The Video Display block displays the video of the object on your Android device.
1. On the Hardware tab of the Simulink model, click Build, Deploy & Start. The androidObjectClassification application launches automatically.
2. Place the object in front of the Android device camera and move the object. Observe the bounding box with the label around the detected object.
3. Move the object and track it on your Android device.
Train the YOLOv2 object detector to detect and track more than one object.
Use a neural network other than ResNet-18 for training the objects and observe the differences in the obtained results.
Use a different algorithm in the Video Labeler app and compare the results with the Point Tracker algorithm.
Change the input image size provided in the deeplearning function and observe how the object detection results change.
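When experimenting with the input size, it helps to know how the feature map scales. The res3b_relu layer outputs a 28-by-28 map for a 224-by-224 input, which corresponds to a downsampling factor of 8. The sketch below applies that factor to a few candidate input sizes. The candidate sizes are illustrative, and the calculation assumes sizes that are multiples of 8.

```matlab
% Downsampling factor implied by the example: 224 / 28 = 8.
stride = 224 / 28;

% Hypothetical input sizes to compare (multiples of the stride).
inputSizes = [224 320 416];

for s = inputSizes
    gridSize = s / stride;   % cells per side of the feature map
    fprintf('%d-by-%d input -> %d-by-%d feature map\n', ...
        s, s, gridSize, gridSize);
end
```

A larger feature map gives the detector a finer grid for localizing the object, at the cost of more computation per frame on the device.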