Emulate Target Agnostic Quantized Network

Target agnostic quantization allows you to see the effect quantization has on your neural network without target hardware or target-specific quantization schemes. Creating a target agnostic quantized network is useful if you:

  • Do not have access to your target hardware.

  • Want to preview whether or not your network is suitable for quantization.

  • Want to find layers that are sensitive to quantization.

Quantized networks emulate quantized behavior for quantization-compatible layers. The network architecture, such as the layers and connections, is the same as in the original network, but inference uses limited-precision types. After you quantize your network, you can retrieve details about what was quantized by using the quantizationDetails function.
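
To make "limited-precision types" concrete, the following minimal sketch emulates signed 8-bit quantization of a few values in plain MATLAB. It is illustrative only: the sample values and the fraction length are assumptions chosen for this sketch, and it does not claim to reproduce the exact scheme that the quantized network uses internally.

```matlab
% Illustrative only: emulate signed 8-bit quantization with a power-of-two
% scale. The values and the fraction length are made up for this sketch.
x = single([-1.5 0.3 0.75 2.0 5.0]);   % example floating-point values
fractionLength = 5;                     % assumed number of fractional bits
scale = 2^(-fractionLength);            % step between representable values

q = round(x ./ scale);                  % map each value to the integer grid
q = min(max(q, -128), 127);             % saturate to the int8 range
xEmulated = q .* scale;                 % map back to floating point

disp([x; xEmulated])                    % original values vs. emulated values
quantError = abs(x - xEmulated)         % rounding and saturation error
```

Values that fall between grid points, such as 0.3, are rounded to the nearest representable value. Values beyond the representable range, such as 5.0, saturate at the largest representable value.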

Quantize a Neural Network with MATLAB

This example shows how to quantize a neural network by using MATLAB. This workflow allows you to inspect the quantized network before generating code or deploying it to a specific target. The example uses the pretrained squeezenet convolutional neural network to demonstrate quantization.

Load the pretrained and modified squeezenet network.

load squeezenetmerch
net
net = 
  DAGNetwork with properties:

         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}

Define calibration and validation data to use for quantization.

Use the calibration data to collect the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network and the dynamic ranges of the activations in all layers of the network. For the best quantization results, the calibration data must be representative of inputs to the network.

Use the validation data to test the network after quantization to understand the effects of the limited range and precision of the quantized convolution layers in the network.
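
As a rough illustration of what collecting a dynamic range means, the sketch below tracks the minimum and maximum of a tensor across several batches and derives a power-of-two scale from the result. The batches are synthetic stand-ins, and the scale-selection rule is one common choice assumed for this sketch, not necessarily the rule that the calibration step applies.

```matlab
% Illustrative only: track the dynamic range of a tensor across several
% calibration batches. The batches below are synthetic stand-ins.
batches = {[-0.4 0.9 1.2], [0.1 -1.1 2.0], [0.7 0.2 -0.3]};

minValue = inf;
maxValue = -inf;
for k = 1:numel(batches)
    minValue = min(minValue, min(batches{k}(:)));   % running minimum
    maxValue = max(maxValue, max(batches{k}(:)));   % running maximum
end

% Assumed rule: pick the smallest power-of-two scale whose int8 grid
% still covers the largest observed magnitude.
maxAbs = max(abs([minValue maxValue]));
exponent = ceil(log2(maxAbs / 127));
scale = 2^exponent;
fprintf('range = [%g, %g], scale = 2^%d\n', minValue, maxValue, exponent);
```

If the calibration data is not representative, inputs seen at inference time can exceed the observed range and saturate, which is why the choice of calibration data matters.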

For this example, use the images in the MerchData data set. Define an augmentedImageDatastore object to resize the data for the network. Then, split the data into calibration and validation data sets.

imds = imageDatastore('MerchData', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
[calData, valData] = splitEachLabel(imds, 0.7, 'randomized');
aug_calData = augmentedImageDatastore([227 227], calData);
aug_valData = augmentedImageDatastore([227 227], valData);
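
Before calibrating, it can help to confirm that every class appears in both subsets. A minimal check, assuming calData, valData, and imds exist in the workspace from the preceding code:

```matlab
% Assumes imds, calData, and valData exist in the workspace from the split above.
calCounts = countEachLabel(calData)   % images per class in the calibration set
valCounts = countEachLabel(valData)   % images per class in the validation set

% Every image should land in exactly one of the two subsets.
totalImages = sum(calCounts.Count) + sum(valCounts.Count)
```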

Create a dlquantizer object with MATLAB as the execution environment and specify the network to quantize.

quantObj = dlquantizer(net,'ExecutionEnvironment','MATLAB');

Calibrate the network with the calibration data, and then use the quantize function to create the quantized network.

calResults = calibrate(quantObj, aug_calData);
qNet = quantize(quantObj)
qNet = 
  Quantized DAGNetwork with properties:

         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}
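
The calibration statistics returned by calibrate can be inspected before moving on. The sketch below assumes calResults is still in the workspace and avoids hard-coding column names, because the exact names are not shown in this example.

```matlab
% Assumes calResults exists in the workspace from the calibrate call above.
calResults.Properties.VariableNames   % list the statistics that were collected
head(calResults)                      % preview the first few rows
```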

Use the quantizationDetails function to extract quantization details.

qDetails = quantizationDetails(qNet)
qDetails = struct with fields:
            IsQuantized: 1
          TargetLibrary: "none"
    QuantizedLayerNames: [26×1 string]
    QuantizedLearnables: [52×3 table]
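
The fields of the returned structure can be explored further. A short sketch, assuming qDetails and qNet exist in the workspace from the preceding calls:

```matlab
% Assumes qDetails and qNet exist in the workspace from the calls above.
qDetails.QuantizedLayerNames(1:5)     % first five quantized layer names
head(qDetails.QuantizedLearnables)    % preview the quantized learnable parameters

% Fraction of the network's layers that were quantized
quantizedFraction = numel(qDetails.QuantizedLayerNames) / numel(qNet.Layers)
```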

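Classify the validation data with the quantized network and compute the correct classification rate.

ypred = classify(qNet, aug_valData);
ccr = mean(ypred == valData.Labels)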
ccr = 1

The quantized network classifies every validation image correctly, so quantization causes no drop in accuracy on this data set.
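
To measure any accuracy change directly, the same metric can be computed for the original network and compared. This sketch assumes net, qNet, aug_valData, and valData are still in the workspace. The variable names ccrOriginal, ccrQuantized, and accuracyDrop are introduced here for illustration.

```matlab
% Assumes net, qNet, aug_valData, and valData exist in the workspace.
ypredOriginal = classify(net, aug_valData);             % baseline predictions
ccrOriginal = mean(ypredOriginal == valData.Labels)

ypredQuantized = classify(qNet, aug_valData);           % quantized predictions
ccrQuantized = mean(ypredQuantized == valData.Labels)

accuracyDrop = ccrOriginal - ccrQuantized               % positive means quantization lost accuracy
```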
