Why are the values of learnables in a quantized dlnetwork still stored as float32 (single precision)?

Even though dlquantizer quantizes the weights of the fully connected layer to int8 and the bias of the layer to int32, why do I see the values in the quantized dlnetwork still stored as float32 (single precision)?
Also, how can I find out whether dlquantizer can quantize a particular layer?

Accepted Answer

MathWorks Fixed Point Team on 18 Jul 2025
Edited: MathWorks Fixed Point Team on 18 Jul 2025
Yes, the learnables of the quantized dlnetwork are still stored in single precision.
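For illustration, here is a minimal sketch of the quantization workflow that shows this behavior, assuming a trained dlnetwork named net and a calibration datastore named calData (both hypothetical names):

% Quantize the network for the MATLAB execution environment
quantObj = dlquantizer(net, ExecutionEnvironment="MATLAB");
calibrate(quantObj, calData);   % collect dynamic ranges from calibration data
qNet = quantize(quantObj);      % returns a quantized dlnetwork

% The Learnables table of the quantized network still holds single precision
class(extractdata(qNet.Learnables.Value{1}))   % returns 'single'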
To estimate the parameter memory of the quantized network once deployed, use estimateNetworkMetrics: https://www.mathworks.com/help/deeplearning/ref/estimatenetworkmetrics.html.
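As a sketch, continuing with the hypothetical net and qNet from above, you can compare the estimated per-layer metrics before and after quantization:

% Estimated metrics reflect the deployed (quantized) parameter memory
originalMetrics  = estimateNetworkMetrics(net);
quantizedMetrics = estimateNetworkMetrics(qNet);
disp(originalMetrics)
disp(quantizedMetrics)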
The layers that dlquantizer decides to quantize are listed here: https://www.mathworks.com/help/deeplearning/ug/supported-layers-for-quantization.html. The set of supported layers changes across releases and varies by intended target.
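To check programmatically whether a particular layer was quantized, you can query quantizationDetails (again a sketch, assuming the hypothetical qNet from above):

qDetails = quantizationDetails(qNet);
qDetails.QuantizedLayerNames    % names of the layers that were actually quantized
qDetails.QuantizedLearnables    % table holding the int8 weights / int32 biases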
The 'Analyze for Compression' feature in the Deep Network Designer app (available in R2025a) shows which layers in your network are supported for quantization, which can be friendlier than manually comparing against the supported-layers documentation page. It currently only analyzes for the MATLAB execution environment.
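To try it, the network can be opened in the app from the command line (assuming the hypothetical net; the exact location of the option in the app may vary):

deepNetworkDesigner(net)   % then run 'Analyze for Compression' from the app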
