

Create quantized deep neural network



quantizedNetwork = quantize(quantObj) creates a quantized neural network object from a calibrated dlquantizer object, quantObj. The quantized network object, quantizedNetwork, exposes the quantized layers, weights, and biases of the network and lets you observe quantized inference behavior.


quantizedNetwork = quantize(quantObj,Name,Value) creates a quantized neural network object from a calibrated dlquantizer object, quantObj, with additional options specified by one or more name-value arguments.



This example shows how to use the quantize method to quantize a neural network in MATLAB.

Prepare Data

Load the pretrained and modified squeezenetmerch network.

load squeezenetmerch
net
net = 
  DAGNetwork with properties:

         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}

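Unzip the MerchData folder, create an image datastore that labels each image by its folder name, and split the data into calibration and validation sets.

unzip('MerchData.zip');
imds = imageDatastore('MerchData', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
[calData, valData] = splitEachLabel(imds, 0.7, 'randomized');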

Resize the images in both the calibration and validation data to the input size that the network requires.

aug_calData = augmentedImageDatastore([227 227], calData);
aug_valData = augmentedImageDatastore([227 227], valData);

Quantization of the Network

Create a dlquantizer object for the network with the execution environment set to MATLAB. How the network is quantized depends on the execution environment.

quantObj = dlquantizer(net,'ExecutionEnvironment','MATLAB')
quantObj = 
  dlquantizer with properties:

           NetworkObject: [1×1 DAGNetwork]
    ExecutionEnvironment: 'MATLAB'

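Use the calibrate function to exercise the network with the calibration data and collect range statistics.

calResults = calibrate(quantObj,aug_calData);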

The quantize method creates a quantized network object in which you can view all the layers and network properties.

qNet = quantize(quantObj)  
qNet = 
  Quantized DAGNetwork with properties:

         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}

Use the quantizationDetails method to extract quantization details.
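
As a minimal sketch, assuming the qNet object created above is still in the workspace, the following call returns a structure that summarizes the quantization. The variable name qDetails is chosen here for illustration.

```matlab
% Assumes qNet was produced by quantize(quantObj) earlier in this example.
qDetails = quantizationDetails(qNet);

% IsQuantized is true when the network contains quantized layers.
disp(qDetails.IsQuantized)

% List the names of the layers that were quantized.
disp(qDetails.QuantizedLayerNames)
```

The structure also carries a QuantizedLearnables table, which is a convenient place to inspect the quantized weights and biases layer by layer.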

Make Predictions Using Both Networks

predQuantized = classify(qNet,aug_valData);    % Predictions for the quantized network 
predOriginal = classify(net,aug_valData);  % Predictions for the non-quantized network 

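Compute the relative accuracy of the quantized network compared to the original network, that is, the percentage of validation images for which both networks predict the same class.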

ccrQuantized = mean(predOriginal==predQuantized)*100
ccrQuantized = 100

For this validation data set, the predictions of the quantized network match the predictions of the original network for 100% of the images.
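
The comparison above measures agreement between the two networks, not correctness. A minimal sketch of measuring accuracy against ground truth follows, assuming the image datastore was created with 'LabelSource','foldernames' so that valData.Labels holds the true classes. The variable names trueLabels, accQuantized, and accOriginal are illustrative.

```matlab
% Assumes valData.Labels contains ground-truth labels (categorical),
% and that predQuantized and predOriginal were computed as shown above.
trueLabels = valData.Labels;

% Percentage of validation images classified correctly by each network.
accQuantized = mean(predQuantized == trueLabels)*100;
accOriginal  = mean(predOriginal  == trueLabels)*100;

fprintf('Original: %.1f%%, quantized: %.1f%%\n', accOriginal, accQuantized);
```

Reporting both numbers makes it easier to tell whether a disagreement between the networks reflects a loss of accuracy from quantization.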

Input Arguments


quantObj is the dlquantizer object that contains the network to quantize. The object must already be calibrated using the calibrate object function.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: quantizedNetwork = quantize(quantObj,'ExponentScheme','Histogram')

ExponentScheme specifies the exponent selection scheme for quantization as 'MinMax' or 'Histogram'. The MinMax scheme selects the exponent from the range information in the calibration statistics and avoids overflow by covering the full observed range. The Histogram scheme is distribution-based scaling that selects the exponent that best fits the calibration data.

Example: 'ExponentScheme', 'MinMax'
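
To give a feel for the difference between range-based and distribution-based exponent selection, here is an illustrative sketch in plain MATLAB. It is not the algorithm that quantize uses: the data, the power-of-two scaling, the candidate search window, and the mean-squared-error criterion are all assumptions made for this illustration.

```matlab
% Illustrative sketch only; this is NOT the algorithm that quantize uses.
% Hypothetical calibration values: a dense cluster plus one outlier.
x = [linspace(-1,1,1001), 8];

nBits = 8;
qmax = 2^(nBits-1) - 1;            % largest int8 value, 127

% MinMax-style: smallest power-of-two scale whose range covers max(abs(x)),
% so no calibration value overflows.
eMinMax = ceil(log2(max(abs(x))/qmax));

% Histogram-style: among nearby exponents, pick the one with the lowest
% mean squared quantization error, allowing outliers to saturate.
candidates = (eMinMax-4):eMinMax;
mse = zeros(size(candidates));
for k = 1:numel(candidates)
    s = 2^candidates(k);                         % scale factor
    q = min(max(round(x/s), -qmax-1), qmax);     % saturate to int8 range
    mse(k) = mean((x - q*s).^2);
end
[~, best] = min(mse);
eHist = candidates(best);

fprintf('MinMax exponent: %d, Histogram-style exponent: %d\n', eMinMax, eHist);
```

The range-based choice never clips a calibration value, while the error-based choice may accept clipping of rare outliers in exchange for finer resolution on the bulk of the data.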

Output Arguments


Quantized neural network, returned as a DAGNetwork, SeriesNetwork, yolov2ObjectDetector (Computer Vision Toolbox), or ssdObjectDetector (Computer Vision Toolbox) object.


Limitations

  • For C/C++ and CUDA code generation, the software generates code for a convolutional deep neural network by quantizing the weights, biases, and activations of the convolution layers to 8-bit scaled integer data types. The quantization is performed by providing the calibration result file produced by the calibrate function to the codegen (MATLAB Coder) command.

    Code generation does not support quantized deep neural networks produced by the quantize function.
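
A hedged sketch of the code generation path described above follows. It assumes GPU Coder with the cuDNN library and a dlquantizer object that was created with 'ExecutionEnvironment','GPU' and then calibrated, which differs from the 'MATLAB' environment used in the example on this page. The MAT-file name and the entry-point function predict_int8 are hypothetical.

```matlab
% Save the calibrated dlquantizer object (file name is illustrative).
save('quantObj.mat','quantObj');

% Configure int8 code generation for cuDNN.
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.DeepLearningConfig.DataType = 'int8';
cfg.DeepLearningConfig.CalibrationResultFile = 'quantObj.mat';

% predict_int8 is a hypothetical entry-point function that loads the
% original (non-quantized) network and calls predict on its input.
codegen -config cfg predict_int8 -args {ones(227,227,3,'single')}
```

In this workflow the quantization happens during code generation, so the entry-point function refers to the original network rather than to the output of quantize.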

Version History

Introduced in R2022a